An Introduction to HOP

A new Hardware Specification Language (HSL) called HOP is presented. HOP stands for
Hardware viewed as Objects and Processes. It can be used for specifying the structure,
behavior, and timing of digital systems.

We designed HOP for several reasons. First, it migrates well-tested ideas from our past work
[GSS87,Gop86,Gop87] that was based on an abstract data type [GHM78] view of hardware
systems into a new, simple, and deterministic process model that we have invented. Our
process model is inspired by the works of [Mil82], [Mil83], and [Hoa85].

Secondly, we believe that not only should an HSL be founded in mathematical principles,
but it also ought to be simple, intuitive to use, and address practical issues, especially if
practicing VLSI designers are to be encouraged to use it.

HOP  was  designed  to  meet  the  following  design  objectives: 

1) be capable of modeling large architectures as well as simple MOS digital circuits;

2)  support  the  writing  of  a  priori  as  well  as  a  posteriori  specifications; 

3)  possess  a  simple  and  rigorous  semantics; 

4)  support  static  analysis  techniques  and  design  verification; 

5) match digital designers’ intuitions closely;

6)  be  demonstrably  efficient  in  handling  many  important  practical  issues; 

7)  act  as  a  common  repository  of  related  information  falling  in  various  domains  (functional 
behavior,  timing,  geometry,  and  user  documentation  to  name  a  few)  thereby  helping 
in  designer  and  tool  integration; 

8) support design automation as well as manual design.


In  this  paper  we  show  how  HOP  meets  objectives  2,  3,  4,  5,  and  6.  Objectives  1  and  7  will  be 
addressed  in  the  process  of  specifying  a  large  Application  Specific  IC  (ASIC)  called  the  “Roll 
Back  Chip”  [FTG88a,Gop]  that  we  are  currently  engaged  in.  Objective  8  will  be  addressed 
in  our  ongoing  work  on  implementing  a  VLSI  design  system  centered  around  HOP. 


1.1  Features  of  HOP,  in  a  Nutshell 

Let  us  take  an  informal  approach  similar  to  [Mil80,  Page  10]  to  intuitively  understand  HOP. 
The  externally  observable  features  of  every  hardware  module  modeled  in  HOP  consist  of  a  set 
of  “light  actuated  sensors”  (input  events),  “a  set  of  lamps”  (output  events),  a  set  of  “output 
conduits”  that  bring  out  data  items,  and  a  set  of  “input  conduits”  that  can  consume  data 
items.  In  addition,  each  module  maintains  “a  notebook”  (the  internal  datapath  state)  that 
maintains  a  complete  record  of  all  its  input  events  and  input  data  port  values  in  as  concise  a 
form  as  possible.  The  notebook  (internal  datapath  state)  is  visible  to  human  observers  but 
not to other modules. There can be modules of an extreme variety too; for instance those
that have only sensors and lamps and no internal notebooks (e.g. controllers). The values
of data path states and the values shipped through conduits belong to one of the data types
supported by HOP, such as queues, trees, bytes, or a user-defined data type.

A human observer may observe the following behavior of every module: its current
output lamp statuses and output conduit values are entirely predictable from its current
input sensor statuses, input conduit values, and notebook contents. Further, its notebook
contents at the next time step as well as its entire future behavior are also predictable. In
other words, HOP processes are functions from current input events, data values, and data
path state to HOP processes. (They are deterministic.)

At  each  time  step  a  module  M  either  does  or  does  not  wait  for  one  of  its  input  sensors 
to  be  actuated  (i.e.  it  offers  a  deterministic  choice  of  input  events).  If  it  offers  such  a 
choice,  only  one  input  sensor  may  be  actuated.  When  so  actuated  by  another  module  A, 
synchronization is achieved between M and A. M may then produce and consume data
values through its conduits at the current time step and make progress in its computation.
If M waits for a sensor to be actuated but none is actuated, its current actions and future
computation are both undefined (failure to synchronize). On the other hand, if M doesn’t
wait for any of its input sensors to be actuated, it may make progress in its computation
autonomously after having produced and/or consumed data items at the current time step.
We assume that light from lamps reaches light-sensors instantaneously.

A collection of interacting modules is formed through parallel composition, “||”. Collections
of interacting modules can make progress in their computation only through synchronizations;
i.e. every input sensor awaited by a module to be actuated should actually be
actuated  by  some  other  module.  All  the  actions  through  events  and  ports  in  a  system  of 
modules  happen  at  the  same  rate  (as  in  [Mil82]).  If  a  module  is  busy  performing  internal 
computation,  it  will  be  regarded  as  outputting  an  output  event  called  “idle”. 

Not all sensors and lamps are alike; they have different colors. A lamp and a sensor
may interact either if they have the same color to begin with, or if they are imparted the
same color (this is called renaming). A lamp of a given color can actuate (virtually) an
infinite number of sensors of that color at the time when it shines (events have a broadcast
semantics). Two lamps of the same color flashing at the same time are equivalent to one lamp
flashing with that color.

Input  and  output  data  conduits  are  meant  to  be  connected  amongst  themselves.  A  given 
architecture  has  a  specific  “plumbing”  of  the  conduits.  No  synchronization  is  defined  for  data 
transfers through conduits. Thus one module may put out a value without any other module
sampling this value, or vice versa. However, modules usually achieve synchronization through
their flashing lights and thereafter meaningfully interact through their conduits. Output conduits
have a broadcast semantics. Two output conduits connecting to a node may not assert two
incompatible values to the node (compatibility is defined via a function bus that computes the
least upper bound of the values involved over a strength lattice). Most conduits are assigned specific
directions  to  begin  with;  it  is  possible  to  have  perfectly  directionless  conduits  too. 

Selected lamps, sensors, and conduits may be hidden from a collection of interconnected
modules Mi. The collection Mi may then be viewed as a single module M that possesses
only those events and ports that are not hidden. If a submodule within M waits for a sensor
with color c to be actuated, no other submodule within M produces a light of color c, and
events of color c are not part of M's interface (due to hiding), then M's computation is
undefined. The same goes for a lamp that shines within M without actuating any sensor
within M and hidden from M's interface.

However, a lamp-sensor pair that communicate within M may be hidden without any
risk. The contribution of this hidden and communicating lamp-sensor pair to the interface
of M is just an “idle” event.
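
The hiding rules above can be illustrated for a single time step. The following Python sketch is only an illustration under simplifying assumptions: each module awaits at most one color per step, and the names compose_tick and UndefinedComputation are hypothetical and not part of HOP. It is not the PARCOMP algorithm mentioned later.

```python
class UndefinedComputation(Exception):
    """The composite module's behavior is undefined at this step."""


def compose_tick(awaited, emitted, hidden):
    """Check one time step of a collection of modules.

    awaited: dict, module name -> color it waits for (or None)
    emitted: dict, module name -> set of colors its lamps flash
    hidden:  set of colors removed from the composite interface
    Returns (colors still awaited from outside, colors visible outside).
    """
    ext_in, ext_out = set(), set()

    for module, color in awaited.items():
        if color is None:
            continue
        lit_inside = any(color in lamps
                         for other, lamps in emitted.items() if other != module)
        if lit_inside:
            continue                      # synchronized inside the collection
        if color in hidden:
            raise UndefinedComputation(f"{module} waits for hidden color {color!r}")
        ext_in.add(color)                 # must be supplied by the environment

    for module, lamps in emitted.items():
        for color in lamps:
            if color not in hidden:
                ext_out.add(color)        # broadcast: still visible outside
                continue
            sensed_inside = any(a == color
                                for other, a in awaited.items() if other != module)
            if not sensed_inside:
                raise UndefinedComputation(f"hidden lamp {color!r} actuates no sensor")
            # a matched, hidden lamp-sensor pair contributes only "idle"

    # Assumption: with no visible output the composite is taken to emit "idle".
    return ext_in, (ext_out or {"idle"})
```

For example, a module A flashing a hidden color that module B awaits yields only “idle” at the interface, whereas a module awaiting a hidden color that nobody flashes raises UndefinedComputation.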

Relationship  to  Hardware  Modeling 

Modules in HOP are black-boxes that are understood and used only in terms of their
interface. The interface consists of data ports, events, and a protocol specification that uses
events and asserts/queries values to/from ports.

Events are realized as different combinations of control wires or as predicates defined
over data conduits. Modules await either command events or status events. Data conduits
are realized as bus structures that deliver the same data items at the receiving end as items
sent at the sending end (i.e. the busses do not have any wire permutations, tappings, etc.).

HOP is useful for writing both requirements (a priori) specifications and design (a posteriori)
specifications. The manner in which requirements are expressed usually has no bearing
on the actual implementation chosen later. Design specifications capture known facts about
a system that has been built or has been designed in detail. In a HOP based design methodology,
design proceeds hierarchically, and on many occasions (but not always) top-down. For
most large systems, the requirements specification consists of the specification of a collection
of modules and not one module; for these systems, the single module view is only derived a
posteriori.

Requirements  specifications  are  usually  written  without  knowing  at  least  two  detailed 
aspects:  (i)  the  details  of  the  functional  behavior  (I/O  mappings)  of  module(s);  (ii)  the 
details  of  the  temporal  behavior  of  module(s).  Abstraction  mechanisms  permit  modeling 
systems completely despite missing details. We employ two important abstraction mechanisms:
(i) data abstractions, to model the functional aspects of the I/O port values as well
as data path state; (ii) temporal abstractions in the form of a protocol description consisting
of events and event sequences that describe the control aspects.

Most of our applications of HOP to date (as well as the examples in this paper) pertain
to synchronous hardware systems, i.e. systems in which: (i) the computational rates of the
modules are the same; (ii) communications between modules are lockstep synchronous with
a global clock. While writing the requirements specifications for these systems, however,
not enough may be known about the clocking aspects. In these cases, we pretend
that these synchronous systems are actually asynchronous systems, i.e. those in which all
synchronizations between events happen via handshaking. Later on, when a design in the
synchronous style is produced, most of these “handshakes” happen implicitly, i.e. without
actually exchanging any signals, but via hard-wired assumptions built into modules. However,
HOP encourages making these hard-wired timing/synchronization assumptions explicit via
the introduction of events.

In  HOP  one  could  write  a  module  requirements  specification  and  later  replace  it  by  a 
collection  of  module  requirements  specifications.  It  is  possible  to  check  whether  the  collection 
is  observationally  equivalent  to  the  (original)  single  module.  Design  specifications  may  also 
be  written  in  HOP.  Design  specifications  include  details  that  closely  match  the  details  of 
the ultimate hardware. Thus typical design specifications of synchronous hardware systems
would include clocks.

Figure 2: A master/slave Flip-Flop, FF

1.2  What  is  Attractive  About  HOP? 

A variety of existing as well as new ideas have been integrated into HOP. We discuss these
ideas  with  the  aid  of  the  example  in  figure  2. 

• Highlighting Useful Input Combinations Through Events: The circuit in figure 2 may be
viewed at different levels. At first glance it is an analog circuit. However, it will be used as a
digital system, specifically a master/slave flip-flop. Therefore not all logical combinations
of the signals load, copy, and circ are useful. Only three events prove to be useful:

ld = ?load ∧ ¬?copy ∧ ¬?circ (loads FF);
cp = ?copy ∧ ¬?load ∧ ¬?circ (copies data to slave);
cr = ?circ ∧ ¬?load ∧ ¬?copy (restores the charge).

HOP encourages the identification and declaration of such events. This gives rise to
specifications that are easier for humans to understand and easier for programs to institute
static checks on.
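
One way to read these event definitions is as a decoder from raw control wires to named events. The Python sketch below is only an illustration, assuming boolean wire values; the function name decode_event is hypothetical and not part of HOP.

```python
from typing import Optional


def decode_event(load: bool, copy: bool, circ: bool) -> Optional[str]:
    """Map the three control wires of FF to one of the declared events.

    Returns None for every wire combination that is not declared as an event.
    """
    if load and not copy and not circ:
        return "ld"   # loads FF
    if copy and not load and not circ:
        return "cp"   # copies data to slave
    if circ and not load and not copy:
        return "cr"   # restores the charge
    return None       # not a useful input combination


# Enumerate all eight wire combinations; only three map to an event.
useful = {
    (l, c, r): decode_event(l, c, r)
    for l in (False, True) for c in (False, True) for r in (False, True)
}
```

A static checker can use such a table to reject any combination of wires that does not correspond to a declared event.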

•  Incorporation  of  Multiple  Clocks:  It  is  important  to  be  able  to  incorporate  multi-phase 
clocks and classify events based on them. Consider the usage of FF in a two-phase clocked
system that has two non-overlapping clocks a and b. It would then be necessary to generate
the above events only during specific clock phases. HOP allows this to be specified thus:

ld = a ∧ ?load ∧ ¬?copy ∧ ¬?circ

Figure 3: Excerpts from the Process Diagram of FF

• Highlighting Useful Event Sequences (“Modes of Behavior”): Only certain sequences of the
events ld, cp, and cr are of interest. Figure 3 shows these sequences. What we have shown
in this figure is actually salient excerpts from the process diagram of the FF. The above
diagram can be expressed in the syntax of HOP as:

FF <= ld -> cp -> FF
    | cr -> cp -> FF

• Detection of a Class of Sequencing Errors Statically: HOP forces designers to state the
sequences of events of interest. The system can flag an error should an unspecified sequence
manifest. This is easy to do due to the synchronization semantics of events. Thus for
FF, a sequence ld, cr does not constitute a useful mode of behavior; were such a sequence
to be applied, it would be regarded as a sequencing error. In many traditional approaches,
sequencing errors are detected in the process of simulating a circuit; to be assured of the
detection of all lurking sequencing errors, a very large number of simulation test cases have
to be applied. Even then, sequencing errors are not directly noticed, but have to be deduced
through backwards reasoning from an observed anomalous behavior, such as two values
clashing on a bus. Many sequencing errors can be detected during the process of composing
two HOP processes using an algorithm called PARCOMP that we have developed.

ABSPROC FF
PORTS    ?din, !dout, ?ld, ?cr, ?cp : bit
CLOCKS   twophase(a,b)
EVENTS   ld = ?load /\ a /\ not(?circ) /\ not(?copy)
         cp = ?copy /\ b /\ not(?circ) /\ not(?load)
         cr = ?circ /\ a /\ not(?load) /\ not(?copy)
PROTOCOL
         FF[dps] <= ld, x=?din, !dout=dps -> cp -> FF[x]
                  | cr, !dout=dps -> cp -> FF[dps]
END FF

Figure 4: The Complete Absproc Specification of FF
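
What counts as an unspecified sequence can be illustrated with a small trace checker over the control skeleton of FF. This Python sketch is not PARCOMP and performs no static analysis; it only replays a given event sequence against the protocol. The state label AWAIT_CP and the names run and SequencingError are illustrative assumptions.

```python
# Control skeleton of FF, read off the protocol:
#   FF <= ld -> cp -> FF | cr -> cp -> FF
# "AWAIT_CP" is an illustrative label for the unnamed intermediate control state.
TRANSITIONS = {
    ("FF", "ld"): "AWAIT_CP",
    ("FF", "cr"): "AWAIT_CP",
    ("AWAIT_CP", "cp"): "FF",
}


class SequencingError(Exception):
    """An event was applied that the protocol does not offer in this state."""


def run(events, state="FF"):
    """Replay a sequence of events; return the final control state."""
    for step, event in enumerate(events):
        nxt = TRANSITIONS.get((state, event))
        if nxt is None:
            raise SequencingError(
                f"event {event!r} at step {step} is not offered in state {state}")
        state = nxt
    return state
```

Replaying ld, cp, cr, cp returns to control state FF, whereas replaying ld, cr raises SequencingError at the second event.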

• Separation of Data and Control, and Incorporation of Data I/O into Specifications: In
the  design  of  digital  systems,  architects  use  their  intuitions  to  separate  data  related  aspects 
from  control  related  aspects.  We  believe  that  an  HSL  must  support  this  separation  process. 
Control aspects may be loosely defined as those modes of behavior that are unaffected by the
datapath  states.  Consider  a  stack  as  an  example.  For  all  its  data  path  states  where  the 
stack  is  neither  full  nor  empty,  the  same  control  recipe  suffices.  Thus,  by  separating  data 
from  control,  we  again  impart  a  good  structure  to  the  specification. 

• Highlighting Data Related Protocols: Continuing with the example of FF, the following
important  questions  must  be  clearly  answered  by  its  specification: 

•  When  may  a  user  (possibly  another  module)  read  FF? 

• When may the user write into FF?

Answers  to  such  questions  form  the  usage  protocol  of  FF.  Usage  protocols  are  usually 
complex;  for  instance  FF  may  be  reliably  read  even  while  it  is  being  loaded.  We  embellish 
the process diagram of figure 3 to include such additional pieces of information as annotations.
This results in figure 4, which is a complete HOP specification of FF.

This specification may be read as follows. FF is initially in control state FF and data path
state dps (a one-bit quantity). It offers the choices ld and cr to the external world. If
the external world asserts ld, the input data item is also expected to be supplied at the
same time through the data input port ?din. This is written as x=?din, x being the value
supplied. Despite loading x, the output port !dout continues to remain at its original value,
which is equal to the internal data path state dps. FF then advances to a new control state
(indicated by ->) where it awaits the event cp. This is generated during phase b of the
two-phase clock. Thereafter FF goes back to the control state FF but in data path state x.
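
The reading above can be mirrored by a small simulation. The following Python sketch is an informal model of the FF protocol under stated assumptions: the class name FFModel, the label AWAIT_CP, and the attribute next_dps are hypothetical, clock phases are not modeled, and returning None stands for "no value asserted on !dout".

```python
class FFModel:
    """Informal model of FF[dps] with one control state plus an awaiting state."""

    def __init__(self, dps):
        self.control = "FF"
        self.dps = dps        # current data path state
        self.next_dps = dps   # data path state FF will have after cp

    def step(self, event, din=None):
        """Apply one event; return the value asserted on !dout (or None)."""
        if self.control == "FF" and event == "ld":
            self.next_dps = din          # x = ?din
            self.control = "AWAIT_CP"
            return self.dps              # !dout = dps, the old value
        if self.control == "FF" and event == "cr":
            self.next_dps = self.dps     # no state change on this path
            self.control = "AWAIT_CP"
            return self.dps              # !dout = dps
        if self.control == "AWAIT_CP" and event == "cp":
            self.dps = self.next_dps     # back to FF[x] or FF[dps]
            self.control = "FF"
            return None                  # the protocol asserts no data on cp
        raise ValueError(f"event {event!r} not offered in state {self.control}")
```

Loading 1 into an FF holding 0 still shows 0 on !dout during ld; only after cp does the data path state become 1.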

Similarly we may consider the path starting from FF[dps] labeled via cr and coming
back to FF. In this case, FF doesn’t suffer any state changes nor does it load any input values.
Although this example doesn’t highlight the use of abstract data types, in general data
path states will be modeled using high-level abstract data types (users may introduce new
abstract data types into HOP), and new data path states as well as output port values will
be created using functional expressions. It is appropriate to think of HOP specifications
as specifying deterministic automata that: (i) are enriched to include information about
data path states and port values; (ii) have a synchronization semantics underlying event
interactions; (iii) model value communications as updates of node values over a strength
lattice.

Figure 5: HOP Provides a Compositional Model

The paradigm of separation of data from control is not forced upon the designer. It may
be judiciously applied when found useful. It is also possible to view control lines as data and
vice  versa  when  necessary,  in  a  structured  manner.  Event  to  data  mapping  is  achieved  by 
introducing  a  fictitious  module  that  awaits  the  event  and  generates  a  data  assertion.  Data 
to  event  mapping  is  achieved  by  defining  one  or  more  predicates  (as  needed)  over  the  data 
inputs,  and  defining  events  via  these  predicates. 

•  Compositionality:  HOP  provides  a  compositional  model  for  synchronous  hardware  systems 
as revealed by figure 5. If we have two HOP specifications H1 and H2 and circuits C1 and
C2 corresponding to them, then the process of connecting C1 and C2 to obtain C can be
paralleled in the HOP domain (essentially) by the process of applying PARCOMP to H1 and
H2  to  yield  H.  This  property  will  be  the  basis  for  establishing  the  correctness  of  systems, 
as  well  as  deducing  behavior  from  structure. 
Figure 6: Illustrating Value Communication in HOP

• Deducing Behavior from Structure:

Suppose a collection of data path modules and controllers SMi are connected to form
a system M. Could a behavioral description for M as a black-box module be deduced
automatically? That is, could we automatically obtain a behavioral description that is simple
to understand because it does not require the user to visualize in his mind all possible ways
in which the modules SMi could interact?

We  have  developed  an  algorithm  PARCOMP  to  do  exactly  this  for  HOP  specifications. 
Numerous  heuristics  render  PARCOMP  efficient  in  practice. 

• Modeling Value Communication Naturally and Modularly: A mechanism called data
actions is used to model data transfers over data ports. This mechanism has been found to be
more natural as well as modular to use, as opposed to synchronous value communication.
This mechanism also satisfactorily models the ability for ports (busses) to perform broadcast
as well as bidirectional communication. We explain this with the aid of a synchronously
clocked hardware system depicted in figure 6.

In  a  synchronous  hardware  system,  a  module  can  write  a  data  item  on  a  bus  for  one  or 
more  clock  ticks  even  in  the  absence  of  any  other  modules  simultaneously  reading  from  the 
bus.  Likewise,  a  read  can  go  on  without  any  simultaneous  writes.  Finally  there  are  situations 
such  as  shown  in  the  timing  diagram  in  figure  6.  In  these  situations  it  is  not  appropriate  to 
model  value  communication  through  synchronization  at  every  tick  of  the  interval.  Of  course 
one may still model these situations by forcing a synchronization at each tick, and thereafter
discarding  data  items  “when  not  needed”,  etc.  More  than  the  awkwardness,  this  approach 
suffers  from  a  lack  of  modularity  of  the  individual  specifications  because  the  specification 
writer  has  to  anticipate  this  particular  context  of  usage  of  the  consumer  module. 

Our solution involves an idea borrowed from logical variables as discussed in (and
suggested by the author of) [Lin85]. It also relates to the work of [Bry84] and [ISD88, page
307]. We model data assertion as the process of imparting a value binding to a logical variable.
These value bindings last only for the duration for which the data
assertion lasts. If no data assertion is made, the logical variable is essentially unbound. Data
inputs are modeled via data queries. If one data assertion and several data queries are made
at the same time, the queries would get the value asserted by the data assertion. Absence
of queries or assertions does not cause any problems in our approach.

Multiple writers on busses are modeled as a process of imparting two value bindings to a
logical variable. If these values do not agree, the binding associated with the logical variable
is error. Agreement is defined through a bus function that implements a monotonic mapping
over a strength lattice (similar to [Bry84]). Such a strength lattice is defined for every type of
value that can be communicated over ports. For the bit type (an extension of the boolean type)
the lattice includes the strong values 0, 1, U, and bit-error, where U stands for 0 or 1, but
Unknown; e.g. the state of an un-initialized flip-flop is a U bit. The only weak value (one
that can be dominated by the strong values) is Z, which stands for high-impedance. We
model transistor switches as devices that generate a Z value when open. Thus when module
M1 drives a bus through an open switch and M2 puts a 1 on the bus through a closed switch,
the net state of the bus would be determined by bus(Z, 1), which is 1. Thus the bus would
be bound to 1. In HOP, bidirectional switches are modeled as devices that force agreement
between two logical variables.
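
The bus function for the bit type can be sketched as a least upper bound over a small lattice. The Python sketch below assumes a particular ordering that the text does not spell out: Z is the bottom element, bit-error is the top, and the strong values 0, 1, and U are mutually incomparable, so any two different strong values clash. The helper name resolve is hypothetical.

```python
from functools import reduce

# Values of the bit type as described in the text.
Z, ZERO, ONE, U, ERR = "Z", "0", "1", "U", "bit-error"


def bus(a, b):
    """Least upper bound of two driven values.

    Assumed ordering: Z < {0, 1, U} < bit-error, with 0, 1, U incomparable.
    """
    if a == b:
        return a
    if a == Z:          # the weak value is dominated by anything else
        return b
    if b == Z:
        return a
    return ERR          # two different strong values do not agree


def resolve(assertions):
    """Net binding of a node given every value asserted on it in one tick.

    Returns None when nothing is asserted (the logical variable is unbound).
    """
    if not assertions:
        return None
    return reduce(bus, assertions)
```

With these definitions bus(Z, ONE) evaluates to ONE, matching the open-switch example above, and two writers asserting 0 and 1 resolve to bit-error.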

A good way to look at value communication in HOP is that the proper synchronization
of events guides data queries and assertions into a correct, implicit synchronization.

• Modeling “Arhythmic Arrays”: A majority of examples published in the area of specification
driven design are architectures consisting of dissimilar modules [Mos83,CGM86,Coh88,Hun87].
Regular arrays have received comparatively less attention ([She84,She85], [Pat85],
[MH85], [BW]). Among regular arrays, most examples have involved pipelined or systolic
arrays. Most geometrically regular arrays are, however, not computationally regular. We call
such arrays arhythmic arrays.¹ Systolic arrays are a special case of arhythmic arrays where
both the geometry and the computations are regular. Some examples of arhythmic arrays
are registers, random-access and content addressable memories, FIFO queues, shift registers,
various types of carry-chains, and the LRU matrix discussed in section A.

Issues  in  the  specification  and  verification  of  arhythmic  arrays  are  different  from  those 
for systolic arrays. Systolic systems typically effect data transformations on streams of data,
each member of the array essentially invoking the same operation on the elements of the
stream. Members of arhythmic arrays support multiple modes of activity. During a given
time interval, different members of an arhythmic array are involved in different modes of
activity. Also in arhythmic arrays, geometrical issues are closely coupled with behavioral
issues. 

HOP  addresses  both  behavioral  and  geometric  issues  quite  effectively.  We  have  developed 
an  efficient  divide  and  conquer  technique  for  performing  PARCOMP  on  arhythmic  arrays. 
We  believe  that  HOP  could  be  effectively  used  for  specifying  systolic  systems  by  following 
the  approaches  taken  by  [She85]  or  [Hen84]. 

• Abstraction Mechanisms: In addition to behavioral and temporal abstractions of HOP
discussed in section 1.1, HOP supports data and structural abstractions also.² Structural
abstraction is achieved by the process of selectively hiding internal connections among ports
and among events. Behavioral abstraction is achieved by the introduction of processes and
mathematical functions that model the actual behavior. Data abstraction is the use of a
variety  of  user-defined  data  types  to  model  state  and  port  values.  Two  varieties  of  data  types 
are  supported  in  HOP:  (i)  equationally  defined  abstract  data  types,  similar  to  [GHM78]; 
(ii)  data  types  defined  via  abstract  models,  similar  to  [LS75]. 

¹ In the same “hearty spirit” as the word “systolic”!

² A discussion of these four abstraction mechanisms appears in [BP86, Chapter 9].

There are two approaches to specifying external timing requirements: (i) specify the most
general temporal behavior admissible; (ii) specify concrete bounds on the timing of various
modes of activity. We follow the former approach in this paper.

In order to amplify this point, consider synchronous hardware systems as an example. In
writing the requirements specification of synchronous systems in HOP we actually pretend
that they behave similar to self-timed hardware with handshaking events. These handshaking
events are merely conceptual in nature. When the actual system gets designed, the designer
gives definitions for these conceptual events. For example, for the push operation on a stack,
we will associate a davail event to “notify” the stack when the data to be pushed on it is
available. In the actual implementation, we will not have this handshake line, but instead
a discipline for using the push operation, such as: “the user is expected to supply the data
exactly one tick after applying push.” (This example also points out that it seems attractive
to define HOP’s events in temporal logic.)
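
The stack example can be made concrete by giving the conceptual davail event a definition in terms of push. The Python sketch below is only an illustration of the quoted discipline, assuming events are recorded as one boolean per clock tick; the function name davail_trace is hypothetical.

```python
def davail_trace(push_trace):
    """Derive the conceptual davail event from push, one boolean per tick.

    Encodes the discipline "data is supplied exactly one tick after push":
    davail holds at tick t exactly when push held at tick t - 1.
    """
    shifted = [False, *push_trace]       # delay push by one tick
    return shifted[:len(push_trace)]     # keep the original trace length
```

For a push trace of True, False, False, True, False the derived davail trace is False, True, False, False, True. No davail wire exists in such an implementation; the event is implied by the clocking discipline.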

This  approach  to  timing  has  two  advantages: 

• Implicit (hard-wired) timing assumptions in synchronous hardware are made explicit.
We believe that hard-wired synchronization assumptions are even worse than hard-wired
constants in programs, and are a source of common sequencing errors. Synchronous
hardware designs where synchronization is hard-wired are difficult to modify
and reuse.

• If the conceptual events are actually implemented as signals, we get a self-timed
implementation of the system. Thus a common specification serves both synchronous and
asynchronous implementations.

1.3 Comparison With Related Research

We compare HOP with Circal [Mil85a,Mil83], [Gor81], Johnson [Joh84], HOL [CGM86], and
SBL [GSS87], using FF as an example.

1.3.1 Circal

The examples reported to date suggest that Circal attempts to model systems with
considerably more detail than we care to do in HOP. For instance, FF would be modeled by
modeling wires, inverters, and pass transistors as having some propagation delay. Though
published examples in Circal have not emphasized the identification of useful events and
modes of behavior, in principle this is possible to do.

Circal and HOP share the common feature of taking a process oriented view. However,
a crucial difference exists in the way value communication is performed. In Circal, data
communication between modules over a port that can carry data items of type T is modeled
as synchronous communication over a sort of labels l_t where t ranges over the value sort of
T. In our experience, HOP's approach is more convenient to use for large architectures that
are specified at the system clocking level.

1.3.2  Next-state  and  Output  Function  Based  Approaches 

[Gor81] and Johnson [Joh84] correspond to a modeling style where the next state and current
outputs are functionally determined by the current state and current inputs. HOP's process
diagrams can be made to correspond to this model simply by using only one control state
always, and modeling the rest of the state of the system as data path states. A process
diagram for FF corresponding to this view is shown in figure 7. In this approach, we explicitly
model both the bits stored inside the FF. We then define next data path states for the inputs
ld, cp, and cr.

However, notice that this diagram does not prevent the sequence ‘ld, cr’ from being
applied. This is not a useful mode of use of FF. Besides, this approach is of lower level
because it requires both the storage nodes to be made explicit, whereas ideally the data path
state must only be an abstract model of the state of the system. We have noticed both these
problems in the approaches taken by [Gor81] and Johnson [Joh84].

Finally, the next-state and output function based approaches do not support the notion
of 'synchronization failure'. We believe that the static checks instituted by HOP based on
event sequences are a form of "temporal type checking" that is promising for the early detection
of sequencing errors. They force designers to state their sequencing assumptions and support
the checking of these assumptions for them.

1.3.3  HOL 

The style in which HOL specifications have been presented in publications so far does not
match hardware designers' intuitions very well in one regard: instead of talking about the
internal states of modules, HOL introduces a higher order relation to model relations between
port signals.

We believe that internal states are a very intuitive "reality" in hardware. Besides, states
are nothing but equivalence classes of I/O histories; thus there is an inherent notational
economy in a state based representation.

In contrast to HOP, the style of temporal abstraction followed in HOL [CGM86, Multiplier
Example] is to introduce an existential quantifier that says: "there exists a future
time where the action in question happens". This approach is less operational (hence less
intuitive for practicing hardware designers). It also does not introduce events that correspond
to points in time where some crucial interactions between modules take place. The
introduction of such events in HOP makes specifications more readable and more amenable
to static analysis. It also supports self-timed implementations directly.

1.3.4  SBL 

HOP evolved out of SBL [GSS87, Gop86]. SBL modeled hardware systems as abstract data
types with a set of external operations corresponding to state changing (constructor) and
port-value producing (observer) operations. These operations have associated timing
characteristics. HOP was created to overcome certain restrictions in SBL's ability to model complex
timings. Also, HOP treats controllers as well as data path elements without distinction as
processes; SBL was organized based on a centralized controller discipline.

As in SBL, in HOP internal states of modules as well as values communicated over
ports are modeled using abstract data types. A fairly rich type system exists in HOP. HOP
combines the best of process based models and abstract data type based models.

The purely algebraic approach based on SBL is still being pursued by the second and
third authors of [GSS87], and SBL has independently matured considerably since the time
HOP was created.


1.4  Organization  of  the  Paper 

Section 2 presents our design methodology and the highlights of the language. The
operational semantics of HOP is also presented here. Section 3 presents the specification of two
versions of a simple stack. Section 4 considers the formal verification of these two versions
of the stack. Section 5 presents two versions of the PARCOMP algorithm. The current
implementation as well as future directions of research are presented in section 6.


2  Terminology  and  Operational  Semantics 

2.1  Terminology 

There are three kinds of HOP specifications: absproc, realproc, and vecproc.

An  absproc  specifies  a  module  as  a  black-box.  It  specifies  the  interface  of  the  module, 
consisting  of  data  ports,  a  set  of  events,  and  a  protocol  specification.  A  vecproc  (a  special 
case  of  realproc)  is  tailored  for  specifying  arhythmic  arrays. 

A realproc specifies the realization of a module as a heterogeneous collection of
submodules. In this paper we do not consider the process of picking the realization that best
suits the problem in hand; we only address modeling of the selected realization. A realproc
is a three-tuple <S_i, C, E> where S_i is a collection of module specifications (absproc or
realproc), C is an interconnection, and E is an export-list.

An  interconnection  is  the  union  of  data  interconnections  and  event  interconnections. 
A  data  interconnection  is  a  binary  relation  over  data  ports,  and  indicates  the  ports  that 
are  connected.  An  event  interconnection  is  a  binary  relation  over  events,  and  indicates 
those  events  that  are  forced  to  occur  at  the  same  time  (by  tying  control  wires  together,  for 
example).  An  export-list  is  the  union  of  data  export  list  and  event  export  list.  Data  export 
lists  and  event  export  lists  are  subsets  of  data  ports  and  events  (respectively),  and  indicate 
those  ports  and  events  of  the  submodules  that  are  part  of  the  interface  of  the  realproc. 
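
The three-tuple view of a realproc can be made concrete with a small data-structure sketch. The Python below is only an illustration and is not part of HOP: the class names `Absproc` and `Realproc`, the method `well_formed`, and the encoding of ports and events as (module, name) pairs are all assumptions made for this example. The module and port names are borrowed from the stack example of section 3. The check simply confirms that interconnections relate existing ports and events, and that export lists are subsets of the submodules' ports and events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Absproc:
    """Black-box interface: a name, data ports, and events (protocol omitted)."""
    name: str
    data_ports: frozenset
    events: frozenset


@dataclass
class Realproc:
    """Hypothetical encoding of the three-tuple <S_i, C, E>."""
    submodules: list     # S_i
    data_conn: set       # part of C: pairs of (module, port) endpoints
    event_conn: set      # part of C: pairs of (module, event) endpoints
    data_export: set     # part of E: exported (module, port) endpoints
    event_export: set    # part of E: exported (module, event) endpoints

    def well_formed(self):
        ports = {(m.name, p) for m in self.submodules for p in m.data_ports}
        events = {(m.name, e) for m in self.submodules for e in m.events}
        # Interconnections are binary relations over existing ports / events.
        conn_ok = (all(a in ports and b in ports for a, b in self.data_conn)
                   and all(a in events and b in events for a, b in self.event_conn))
        # Export lists are subsets of the submodules' ports / events.
        export_ok = self.data_export <= ports and self.event_export <= events
        return conn_ok and export_ok


ctr = Absproc("CTR", frozenset({"cdi", "cdo"}),
              frozenset({"cdef", "load", "up", "down"}))
mem = Absproc("MEM", frozenset({"cdo", "din", "dout"}),
              frozenset({"mdef", "read", "write"}))

good = Realproc(
    submodules=[ctr, mem],
    data_conn={(("CTR", "cdo"), ("MEM", "cdo"))},
    event_conn=set(),
    data_export={("MEM", "din"), ("MEM", "dout")},
    event_export=set(),
)
bad = Realproc([ctr, mem], set(), set(), {("MEM", "nosuchport")}, set())
print(good.well_formed(), bad.well_formed())
```

The sketch records structure only; the protocol part of an absproc is deliberately left out.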

Primitive modules are modules whose design refinement in HOP is of no interest. Thus
only an absproc of primitive modules is of interest. For every primitive module M, a
behaviorally identical circuit C is assumed to be available.

The  HOP  design  methodology  for  designing  a  module  M  takes  one  of  the  following 
approaches  (recursively  defined): 

Top-down1 An absproc specification for M (Map) is written, followed by a realproc Mrp.
PARCOMP is used to infer an absproc specification for M from Mrp. The inferred absproc
for M is called Mapi. The behaviors observable at the interfaces of Map and Mapi are
then compared for agreement. At present this is supported by a manual verification
methodology. If an agreement exists, the designer then proceeds to apply the HOP
design methodology to the submodules of M.

Top-down2 The designer does not write Map, but begins by writing Mrp. Mapi is inferred
using PARCOMP, and then Mapi is studied either manually, through simulation, or
through formal verification to confirm that it has the desired behavior. The HOP
design methodology is then applied to the submodules of M.

Bottom-up A partial Mrp is written; specifically, its submodules are selected but the
connections are not determined. The HOP design methodology is then applied to these
submodules. Mrp is completed by providing the interconnection list and the export-list.
Mapi is then inferred from Mrp using PARCOMP, and is then examined for correctness.

2.2  An  Operational  Semantics  for  HOP 

This section is organized as follows. We begin by defining action, the basic unit of
communication activity that a process may engage in. Actions are either events or data actions.
Both events and data actions are further sub-classified. The domain of actions for a process
is act. A set of simultaneous actions is known as a compound action.

We then define reduction rules for compound actions. These reduction rules are based on
the notion of action product, as in [Mil82]. Our action product operator is written as an
infix operator. We then define a process as a system that engages in a compound action ca
at the current time and transforms itself into a new process that begins its activity at the
following time step.

Thus the meaning of a HOP process is its transition relation $\to\ \subseteq Proc \times act \times Proc$, which
is defined via structural induction over the abstract syntax of HOP. The definition of $\to$ is
the operational semantics of HOP. We will define new HOP processes from existing ones by
using the notation $\frac{ante}{conse}$, where ante is an already defined HOP process (the "antecedent"),
and conse (the "consequent") introduces the next syntactic category of processes that has
not been defined so far.

2.2.1  Actions,  and  Action  Product 

Events in HOP consist of input events written as e, output events written as ē (e with an
overbar), and synchronized events, which carry a different mark over e (typed as Se in the
notation of section 3.1).

An input event e represents a logical condition that is awaited (at some time) by a
module. An output event ē represents the satisfaction of a logical condition at a particular
time instant. The notion of synchronized events was introduced in HOP to impart a
broadcast semantics to output events. Let us examine synchronized events in detail.

A synchronized event represents three facts: (i) at the time the synchronized event is
generated, an output event ē has synchronized with one or more input events e; (ii) because
an output event ē has a broadcast semantics, the synchronized event also has a broadcast
semantics; therefore the output event and the synchronized event look the same as far as
an input event e is concerned; (iii) the output event and the synchronized event are treated
differently by the 'hiding' operator of HOP: hiding the synchronized event results in an idle
transition (similar to the τ of [Mil82]); however, hiding the output event ē causes the
synchronization tree to be pruned. This is because the synchronized event represents a mode of
behavior that will be selected because of synchronizations, whereas ē represents a mode of
behavior that will not be selected because it has not synchronized so far, and it is going to be
hidden. We find the usage of synchronized events to be more convenient than the mechanism
of ^-conjunction proposed in [Mil82, page 32] to model broadcast.

Data actions have only one simplification rule defined for them by action product: when
two different data assertions !p = E1 and !p = E2 are made, the resultant value on the port !p
is defined by the function bus(E1, E2). A complete definition of the action product operator
is given in figure 8.

2.2.2  Definition  of  the  Transition  Relation  ^ 

In this section, we define the transition relation by structural induction. Before these
definitions are applied to a realproc or a vecproc, all the port and event names in their submodules
are assumed to be renamed so as to be distinct. Also, every compound action used in a
definition is assumed to be irreducible under the action product operator.

Process  STOP 

STOP  is  the  simplest  of  HOP  processes.  It  has  a  null  transition  relation;  i.e.  it  always 
remains  halted. 

A  finite  process  is  defined  to  be  one  that  will  become  STOP  in  a  finite  number  of  steps.  A 
finite  process  does  not  usually  represent  any  practically  useful  hardware  system.  Therefore 
if  PARCOMP  results  in  a  finite  process  starting  from  non-finite  processes,  there  is  room 
for  suspicion  that  there  are  synchronization  errors  in  the  system.  This  is  how  we  detect 
sequencing  errors  statically  during  PARCOMP:  the  reason  for  giving  rise  to  a  finite  process 
can usually be pinpointed as a collection of unsynchronized events that are hidden.

Sequential  Processes 

Action: $(ca \to P) \xrightarrow{ca} P$

If P is a process, ca -> P is a process that first performs the compound action ca and
then behaves like P. Since actions are performed through mutual cooperation, the correct
way to look at the process P = e -> P' is that P has the potential to perform e and continue
to behave like P'. If ca involves no events at all, the process can always make progress.

Vacuous compound actions are flagged by a single output event idle. Thus a process
idle -> P performs an idling step and continues to behave like P. idle is an identity element
of the action product operator. Most commonly, idle is introduced in a specification as a
result of hiding a synchronized event.

Sequential processes are a special case of deterministic choices where there is exactly one
choice available.


Deterministic  Choice 
Det-choice: $(|_{i \in I}\ ca_i \to P_i) \xrightarrow{ca_k} P_k$, for $k \in I$

The next category of HOP processes considered is the deterministic choice. A process
P = |_i ca_i -> P_i, where i ranges over an index set I, is one that offers a deterministic choice
consisting of the compound actions ca_i during its first computational step. If choice ca_k is
accepted, P continues to behave like P_k. Example: The FF module of section 1 offers a
deterministic choice of the events Id and cr during the first time step.

If  I  has  more  than  one  element,  then: 

1. There must be an input event e_i present in each ca_i. Since the e_i govern the selection
of one of the alternatives of the choice, the e_i must be pairwise mutually exclusive.
Since input events are boolean expressions, two events e_i and e_j are mutually exclusive
if their conjunction is equivalent to false. This fact is almost always decidable in
practice because events are usually defined as boolean expressions. However, HOP
does allow events of a more general nature to be defined, using user-defined predicates
belonging to a Turing-complete language. In such cases, well-formedness checks for
mutual exclusion of events cannot always be carried out. In practice this situation is
not expected to arise frequently.

2. Data queries may appear in an unrestricted manner among the ca_i, and they do not
govern the choice. This is only to enforce a discipline on the use of the choice construct.
It is still possible to implement choices based on current "data inputs", by defining
events that correspond to these data inputs, such as ?port = 55.

A deterministic choice process, such as P (above), can be depicted as a tree where the root
node of the tree corresponds to P, and there are arcs labeled with ca_i leading from the root
node of P to the root nodes of the P_i. Process diagrams are a finite representation of these
trees. These trees are structurally similar to the synchronization trees of [Mil82]. However,
note again the absence of nondeterminism in HOP.
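
When input events are boolean expressions over a small set of wires, the pairwise mutual exclusion required of the guards in a deterministic choice can be checked by exhaustive enumeration. The Python sketch below is an illustration only and makes no claim about how the HOP tools perform this check: the function name `find_overlap`, the wire names `a` and `b`, and the representation of guards as Python predicates are all assumptions.

```python
from itertools import combinations, product


def find_overlap(guards, variables):
    """Return (name1, name2, assignment) for the first pair of guards that can
    both hold, or None if all guards are pairwise mutually exclusive."""
    for (n1, g1), (n2, g2) in combinations(guards.items(), 2):
        # Try every assignment of truth values to the wires.
        for values in product([False, True], repeat=len(variables)):
            env = dict(zip(variables, values))
            if g1(env) and g2(env):
                return (n1, n2, env)
    return None


wires = ["a", "b"]

# e1 = a and not b, e2 = b and not a: their conjunction is equivalent to false.
exclusive = {
    "e1": lambda v: v["a"] and not v["b"],
    "e2": lambda v: v["b"] and not v["a"],
}

# e1 = a, e2 = b: both hold when a and b are asserted together.
overlapping = {
    "e1": lambda v: v["a"],
    "e2": lambda v: v["b"],
}

print(find_overlap(exclusive, wires))
print(find_overlap(overlapping, wires))
```

The enumeration is exponential in the number of wires, which is acceptable for the small guards typical of control events; for guards written as general user-defined predicates, as noted in item 1, no such check is available in general.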


Adding  Actions  To  Initials 


If P is a process, ca1,P is a process which adds ca1 to the initials of P. Further, ca1,P
must obey the restrictions defined for deterministic choices (mutually exclusive guards, and
the same data assertions in all the branches):

Add-to-initials: $\frac{P \xrightarrow{ca} P'}{(ca1, P) \xrightarrow{ca1,\ ca} P'}$


Hiding 

"Hiding an event e" is a shorthand for saying that the input event e, the output event ē, and
the synchronized event (typed as Se in the notation of section 3.1) are all hidden from a
process. In the rule Hiding-sync, we are considering the hiding of the synchronized event.
Since it represents an event resulting from a synchronization, hiding it is considered safe;
we merely replace it by idle. This models the ability to drive several control inputs from one
source and to internalize the connections:

Hiding-sync: $\frac{P \xrightarrow{ca} P', \quad \mathit{Se} \in ca}{(\text{Hide } e \text{ in } P) \xrightarrow{ca[idle/\mathit{Se}]} (\text{Hide } e \text{ in } P')}$


The  notation  “[new/old]’’  is  used  to  mean  that  “new”  replaces  “old”. 

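Hiding e or ē from a process prevents it from synchronizing on these symbols. This can
be captured by pruning those branches of the synchronization tree that are labeled by e or
ē:

Hiding-unsync: $\frac{P \xrightarrow{ca1} P', \quad P \xrightarrow{ca2} P'', \quad e \text{ or } \bar{e} \in ca1}{(\text{Hide } e \text{ in } P) \xrightarrow{ca2} (\text{Hide } e \text{ in } P'')}$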


Hiding a data output port removes data assertions made on that port from the current
compound action of the process. This would affect those processes that perform a data query
from a connected port at the same time:

Hiding-dout: $\frac{P \xrightarrow{ca,\ !p = E} P'}{(\text{Hide } p \text{ in } P) \xrightarrow{ca} (\text{Hide } p \text{ in } P')}$


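Hiding a data input port causes those variables that would have been bound by a data
query on this port to remain unbound:

Hiding-din: $\frac{P \xrightarrow{ca,\ x = ?p} P'}{(\text{Hide } p \text{ in } P) \xrightarrow{ca} (\text{Hide } p \text{ in } P') \text{ with } x \text{ free in } P'}$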


Renaming 

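Processes are made to interact with each other either via events or via data actions on ports,
by renaming those events and ports to common names:^

$\frac{P \xrightarrow{e,\ ca} P'}{(\text{Rename } e \text{ to } e1 \text{ in } P) \xrightarrow{e1,\ ca} (\text{Rename } e \text{ to } e1 \text{ in } P')}$

$\frac{P \xrightarrow{\bar{e},\ ca} P'}{(\text{Rename } e \text{ to } e1 \text{ in } P) \xrightarrow{\overline{e1},\ ca} (\text{Rename } e \text{ to } e1 \text{ in } P')}$

$\frac{P \xrightarrow{da,\ ca} P', \quad da \text{ uses } p}{(\text{Rename } p \text{ to } p1 \text{ in } P) \xrightarrow{da[p1/p],\ ca} (\text{Rename } p \text{ to } p1 \text{ in } P')}$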

Parallel  Composition 

The parallel composition operator || models the process of realizing a system by putting
together several sub-processes, and permitting their interaction through events and ports
that are connected. In HOP, ports having the same name sans the ? and ! symbols are
connected. Likewise, events having the same name sans the overbar are connected:

^Renaming events as well as ports implies the appropriate use of connections as well as "glue logic" in
the underlying hardware.


In the above definition, we assume that the mutually exclusive nature of the choices
offered by one process is not disrupted by the other process. E.g., if P offers e1 and e2, then
we assume that Q does not generate 'ē1, ē2'. A syntactic definition of the Parcomp rule that
strengthens the precondition to this effect is more involved and hence not shown here.

After performing parallel composition according to the above rule, we may simplify the
result by using the following rule (if applicable). This rule captures the effect of value
communication during parallel composition:

Value Communication During Parallel Composition: $\frac{P \xrightarrow{(x = ?p,\ !p = E),\ ca} P'}{P \xrightarrow{(!p = E),\ ca} P'[E/x]}$

Conditionals 

HOP processes are usually defined as process schemas P[dps], where for each value of dps
we have one specific process. dps usually represents the data path state of the process. We
have the notion of conditional processes in HOP that allows us to specify the behavior of a
process based on its dps variable. Thus we may define a process P as:

P[dps] <= if p(dps) then P1[f(dps)] else P2[g(dps)].

After reducing the predicate application p(dps) to true or false, one of the following rules
would apply:

Conditional: $\frac{P1 \xrightarrow{ca} P'}{(\text{if true then } P1 \text{ else } P2) \xrightarrow{ca} P'}$ ; $\frac{P2 \xrightarrow{ca} P'}{(\text{if false then } P1 \text{ else } P2) \xrightarrow{ca} P'}$


Recursion 


A collection of one or more processes may be defined recursively. The following rule (adapted
from, and explained in, [Mil82]) applies:

Recursion: $\frac{P[\text{fix } X.P / X] \xrightarrow{ca} P'}{\text{fix } X.P \xrightarrow{ca} P'}$


In  this  paper,  it  suffices  to  view  recursion  as  iteration. 


Indefinite  Delay 

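The phrase e** stands for: "Delay indefinitely until e occurs." Its definition is as follows.
A process P1 that performs ca1, then delays indefinitely until e occurs (performing ca2
together with e), and then behaves like P2

is equivalent to

P1 <= ca1 -> Q1
Q1 <= not(e) -> Q1
    | e, ca2 -> P2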
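
The expansion of an indefinite delay into an auxiliary waiting process can be traced with a few lines of Python. This is a sketch under stated assumptions, not HOP code: the function name `run_delay`, the control-state labels, and the representation of the awaited event e as a list of boolean samples (one per time step after ca1) are all introduced here for illustration.

```python
def run_delay(e_samples):
    """Trace the control states visited by P1 <= ca1 -> Q1,
    Q1 <= not(e) -> Q1 | e, ca2 -> P2, given one boolean sample of e per step."""
    trace = ["P1", "Q1"]          # P1 performs ca1 and becomes Q1 in one step
    for e in e_samples:
        if e:
            trace.append("P2")    # e occurred: perform e, ca2 and become P2
            break
        trace.append("Q1")        # not(e): stay in the waiting state
    return trace


print(run_delay([False, False, True, True]))
print(run_delay([True]))
print(run_delay([]))
```

The first call waits in Q1 for two steps before moving to P2, the second moves on immediately, and the third never leaves Q1 because e is never sampled as true.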


2.3  An  Assessment  of  the  Merits  of  the  HOP  Model 

Although a formal definition of the underlying semantic model of a language is always
desirable, the practical utility of the model has to be separately established. We now offer our
own supportive comments as well as existence proofs for the merits of HOP's model. Our
approach is partly motivated by that taken in [Mil82].

2.3.1 From Intuitions to Operational Semantics

Since an operational semantics is a compact embodiment of intuitions, it is prone to
disagreement. Consider for instance the definition of action product. It may be argued that
it is acceptable to consider ē,ē as either ē or error, based on whether driving a control
wire with a 1 from two sources is normal or an error (due to the danger of skews, for
example). Our decision in this regard is to treat ē,ē as ē, but issue a warning in the actual
implementation.

Yet  another  place  where  the  HOP  model  can  fail  its  users  if  used  improperly  is  related 
to  set-up  and  hold  times.  Consider  a  read/write  memory.  Throughout  a  write  cycle  of  the 
memory,  the  address  input  must  remain  stable,  lest  an  unintended  location  get  written  into. 
The  data  input  is  allowed  to  change  within  the  write  cycle  so  long  as  it  stabilizes  a  fixed 
duration  before  the  end  of  the  write  cycle.  The  idealized  timing  model  taken  in  HOP  (or  for 
that matter in [Gor81], [CGM86], and [Joh84]) does not make a distinction between these
temporal  types.  Hence  it  is  possible  to  prove  a  design  to  be  correct  without  implying  the 
correctness  of  the  corresponding  circuit. 

Our  solution  is  as  follows.  We  borrow  ideas  from  past  researchers  who  have  attempted 
to  classify  signals  into  different  temporal  types,  such  as  T  (stable  throughout),  E  (stable 
towards the end), etc. [Noi82, Kar84]. We would take a staged approach where a HOP
verification  would  be  followed  by  a  circuit-theoretic  reasoning  based  on  temporal  types. 
This  would  provide  more  reliable  validation  in  addition  to  partitioning  concerns  (verification 
in  an  idealized  model  is  separated  from  checking  for  proper  set-up/hold  times). 

2.3.2 From an Operational Semantics to a Denotational Semantics

Just  as  trace  sets  are  denotations  of  deterministic  CSP  processes  [Hoa85,  Chapter-2],  Trace- 
Nodebinding  sets  (TN  sets)  are  the  denotation  of  HOP  processes.  Traces  have  the  same 
meaning  as  in  [Hoa85],  and  nodebindings  capture  the  effect  of  value  assertions  on  nodes. 

If P is a process, we define a meaning function M such that

$M(P) = \{\langle t_1, \sigma_1 \rangle, \langle t_2, \sigma_2 \rangle, \ldots\}$.

A TN pair $\langle t_i, \sigma_i \rangle$ is in M(P) if and only if process P can perform a sequence of compound
events $t_i$ while generating a sequence of node value bindings $\sigma_i$. Both $t_i$ and $\sigma_i$ are
prefix-closed sets. The TN set is obtainable from the transition relation.

Alternatively, it is possible to assign a Kahn semantics [Kah74] to HOP. Events will then
be  regarded  as  bit-streams,  and  data  I/O  as  general  streams.  A  state-stream  that  feeds  back 
into  the  module  will  capture  the  updating  of  the  data  path  state. 


Figure 9: Either a Shift Register or a Ring Oscillator!


2.3.3  Identification  of  Equational  Laws 

Due  to  the  absence  of  nondeterminism  in  HOP,  its  equational  laws  are  simple  in  nature 
(such  as:  the  action  product  operator  is  commutative  and  associative;  the  ||  operator  is 
commutative  and  associative,  etc.). 

Even  in  the  absence  of  nondeterminism,  we  have  identified  several  useful  and  non-trivial 
notions  of  equivalence  between  HOP  processes.  For  example,  the  following  questions  arise 
during  the  process  of  verification  and  optimization  (say,  via  pipelining): 

•  how  do  we  relate  requirements  specifications  to  design  specifications  in  the  process  of 
verification? 

•  in  what  sense  is  a  pipelined  system  comparable  to  its  non-pipelined  counterpart? 

In  both  the  above  cases,  the  identity  relation  between  TN  sets  is  far  too  strong  to  be  useful. 
We  address  these  questions  in  section  4. 

2.3.4  Handling  Examples  That  Go  Beyond  the  Limits  of  the  Model 

Consider  the  circuit  in  figure  9.  If  we  admit  an  event  corresponding  to  the  closing  of  all  the 
three  switches,  the  node  values  attained  in  this  circuit  would  be  equal  to  the  solution  of  the 
equation x = not(x), i.e. the undefined element BIT-ERROR. All operators are strict on
BIT-ERROR,  as  BIT-ERROR  is  the  top  element  of  the  BIT  value  lattice. 

In  short: 

•  we cannot rule out any circuits based on their structure;

•  though circuits could become unusable with respect to those ports that generate
BIT-ERROR, they could well be used with respect to other ports.

2.3.5  Relationship  to  Automata-theoretic  Models 

HOP specifications that put a finite bound on the value sorts of the data path state and
port value types can be regarded as a collection of communicating Mealy machines [HU79];
a process turning into STOP will then correspond to an error transition made by an
automaton. In HOP, we prevent the state explosion that can result from modeling systems such
as memory arrays using Mealy machines, by factoring system states into data and control
states. For a memory module, the number of control states is independent of the memory
size. The number of data path states does not concern us, as data path states are modeled
using abstract data types, and data path state updates and observations are modeled in a
first-order functional programming language [Hen80].

Figure 10: Schematic of the absproc of a Stack (data ports ?din and !dout; input events
Ireset, Ipush, Ipop, Itop, Isdef, Idata_avail; output events Ofree, Otop_avail)

2.3.6  Supportive  Examples 

Our  primary  source  of  confidence  in  HOP  derives  from  the  success  we  have  had  in  specifying 
a  wide  spectrum  of  digital  systems,  both  small  and  large  as  well  as  complex.  Due  to  the 
shortage of space, we examine only relatively simple examples in this paper.

3  Specifications  of  a  Stack 

3.1 Absproc of an Unbounded Stack

stkabs [dps] <=
    Ireset *> Ofree -> stkabs[reset(dps)]
  | Ipush *> Idata_avail, vdin = ?din *> Ofree -> stkabs[push(dps,vdin)]
  | Ipop *> Ofree -> stkabs[pop(dps)]
  | Itop *> Otop_avail, !dout=top(dps) *> Ofree -> stkabs[dps]
  | Isdef *> Ofree -> stkabs[dps]

Figure 11: Requirements Specification for a Stack

As defined in section 2, the interface of a process consists of a set of data ports, a set of
events, and a protocol specification. Figures 10 and 11 specify the interface of an unbounded
stack, stkabs. The stack operates as follows. It supports the operations reset, push, pop,
top, and sdef (similar to "no op"). The first three modify the state of the stack in an obvious
way. After a top operation, the current top of stack is made available on the output port
!dout.

We  will  now  write  a  requirements  specification  for  the  stack.  In  this  specification,  we 
model  the  stack’s  data-path  state  via  the  stack  abstract  data  type.  We  specify  the  timing 
of the stack in a manner that is uncommitted to any particular clocking discipline. This
requires  that  we  adopt  a  few  conventions  so  that  when  a  design  specification  of  the  stack  is 
available,  it  becomes  possible  to  relate  it  to  the  requirements  specification: 

•  Generate  an  output  event  along  with  every  data  output.  This  event  announces  the 
availability  of  the  data  output.  For  a  set  of  simultaneous  data  assertions,  we  need 
introduce  only  one  such  output  event. 

•  Introduce  an  input  event  along  with  every  set  of  simultaneous  data  queries.  This  input 
event  signifies  data  availability. 

•  Do  not  insist  on  any  specific  delays  between  events. 

•  Notify  the  completion  of  an  “operation”  by  a  “free”  event. 

The specification in figure 11 follows these conventions. For ease of typing, we denote an
input event e using Ie, an output event ē using Oe, and a synchronized event using Se.

Let us examine the push operation. It is selected by applying the corresponding output
event Opush that matches the event Ipush offered by the stack. stkabs then waits
indefinitely for event Idata_avail. When Idata_avail is asserted by a module M, a module M'
(possibly the same as M) is expected to bind the port ?din with the value to be pushed.
Therefore vdin gets bound to this value. Thereafter, the stkabs process performs internal
activities that last an unspecified amount of time. These activities stop when event Ofree
occurs. One step after Ofree occurs, stkabs goes back to its top-level control state, ready
to accept the next command from the external world. In going to the top-level control
state, the data path state is changed to push(dps,vdin). This is signified by the expression
stkabs[push(dps,vdin)].

Now consider the top operation. Once triggered, stkabs goes into a period of internal
activity that is terminated by the occurrence of the event Otop_avail. When this happens
the value binding on the output port !dout is the "top of stack", top(dps). After some more
unspecified delay and one step after event Ofree occurs, the stack returns to its top-level
control state to accept its next command.
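
The requirements specification can be read as a function from the current data path state and a command to the next data path state, the events observed, and any value asserted on !dout. The Python sketch below illustrates that reading and is not a HOP implementation: the function `stkabs_step`, the use of tuples for the stack abstract data type, and the return format are assumptions made for this example. Indefinite delays are not modeled; only the order of events is recorded. The behavior of pop and top on an empty stack is not stated in the text, so the sketch simply inherits whatever Python's tuple operations do.

```python
# A stack abstract data type on immutable tuples (top of stack is the last item).
def reset(dps):
    return ()


def push(dps, v):
    return dps + (v,)


def pop(dps):
    return dps[:-1]


def top(dps):
    return dps[-1]


def stkabs_step(dps, command, din=None):
    """Return (next_dps, events, dout) for one command of the abstract stack.
    Event names follow figure 11; delays between events are not modeled."""
    if command == "reset":
        return reset(dps), ["Ireset", "Ofree"], None
    if command == "push":
        # vdin is bound from ?din when Idata_avail occurs.
        return push(dps, din), ["Ipush", "Idata_avail", "Ofree"], None
    if command == "pop":
        return pop(dps), ["Ipop", "Ofree"], None
    if command == "top":
        # !dout = top(dps) is asserted together with Otop_avail.
        return dps, ["Itop", "Otop_avail", "Ofree"], top(dps)
    if command == "sdef":
        return dps, ["Isdef", "Ofree"], None
    raise ValueError("unknown command: " + command)


dps = ()
dps, _, _ = stkabs_step(dps, "push", 1)
dps, _, _ = stkabs_step(dps, "push", 2)
dps, events, dout = stkabs_step(dps, "top")
print(dps, events, dout)
```

Every command ends with Ofree, which mirrors the convention that the completion of an operation is notified by a "free" event.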


Figure 12: Schematic of the Realproc of a Stack (input events Ireset, Ipush, Ipop, Itop,
Isdef, Idata_avail; output events Ofree, Otop_avail)

3.2  Realproc  of  the  Unbounded  Stack 

The schematic in figure 12 is intended to implement the stack. The Stack Realproc
stkreal is made up of three modules CTR, MEM, and SCTL. Here, process MEM is defined
mutually recursively with another process MEM1. We assume that in this design all the
modules share a global clock.

The realization uses a memory and a counter to (respectively) hold the stack locations
and the stack pointer. A controller decodes the external commands and appropriately
sequences the submodules. This design implements an unbounded stack. Operation push
is implemented by incrementing the counter and writing into the location of the memory
pointed to by the counter. pop is implemented by decrementing the counter. top is implemented
by reading the memory at the location pointed to by the counter. Finally, sdef is implemented
by doing nothing. Suitable control wire encodings trigger these operations.
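
The counter-and-memory realization described above can be sketched in Python and related back to the abstract stack through an abstraction function. This is an illustrative model only: the class `StackReal`, the method `abstraction`, the choice that reset loads 0 into the counter, and the use of a dictionary as an unbounded memory are assumptions, and the clock-level sequencing performed by the controller is not modeled.

```python
class StackReal:
    """Counter (stack pointer) plus memory, following the prose description."""

    def __init__(self):
        self.ctr = 0     # counter state
        self.mem = {}    # memory state: address -> value

    def reset(self):
        self.ctr = 0     # assumption: reset loads 0 into the counter

    def push(self, value):
        self.ctr += 1                 # increment the counter ...
        self.mem[self.ctr] = value    # ... then write at the new location

    def pop(self):
        self.ctr -= 1                 # decrement; memory contents are left alone

    def top(self):
        return self.mem[self.ctr]     # read at the location the counter points to

    def abstraction(self):
        """Map the concrete state to the abstract stack (bottom first)."""
        return [self.mem[a] for a in range(1, self.ctr + 1)]


real, spec = StackReal(), []
for op, arg in [("push", 7), ("push", 9), ("pop", None), ("push", 4)]:
    if op == "push":
        real.push(arg)
        spec.append(arg)
    else:
        real.pop()
        spec.pop()
    # After every operation the abstraction agrees with a plain list model.
    assert real.abstraction() == spec

print(real.top(), real.abstraction())
```

Stale values left in memory above the stack pointer are invisible to the abstraction function, which is why pop needs no memory write.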

The  Submodule  Behaviors 

We first examine the details of the write and read operation of the MEM submodule. Write
is invoked by event Iwrite. At this time, the address and data are to be held stable on the
ports ?cdo and ?din respectively. One tick later MEM returns to its control state with data
path state write(ms,va,vd).

------ An up/down Counter ------

CTR [cs] <= Icdef, !cdo=cs -> CTR [cs]
          | Iload, vdin=?cdi -> CTR [vdin]
          | Iup, !cdo=cs -> CTR [up(cs)]
          | Idown, !cdo=cs -> CTR [down(cs)]

------ A read/write Memory ------

MEM [ms] <= Imdef -> MEM [ms]
          | Iwrite, va=?cdo, vd=?din -> MEM [write(ms,va,vd)]
          | Iread, va=?cdo -> MEM1 [ms,va]

MEM1 [ms,oa] <=
            Imdef, !dout=read(ms,oa) -> MEM [ms]
          | Iwrite, na=?cdo, vd=?din,
            !dout=read(ms,oa) -> MEM [write(ms,na,vd)]
          | Iread, na=?cdo, !dout=read(ms,oa) -> MEM1 [ms,na]

------ A stack controller ------

SCTL <= Isdef, Omdef, Ocdef -> SCTL
      | Ireset, Omdef, Ocdef -> Oload, Omdef -> SCTL
      | Ipush, Omdef, Ocdef -> Oup, Omdef -> Owrite, Ocdef -> SCTL
      | Ipop, Omdef, Ocdef -> Odown, Omdef -> SCTL
      | Itop, Omdef, Ocdef -> Ocdef, Oread -> Oidle -> SCTL

------ Stack Realization ------

stkreal [cs,ms] <= Hide {load,up,down,cdef,read,write,mdef,cdo} in
                   CTR [cs] || MEM [ms] || SCTL

Figure 13: Specifications of the Submodules of the stack

PCTL <= Omdef, Ocdef -> PCTL
      | Ireset, Omdef, Ocdef -> Oload, Omdef -> PCTL
      | Ipush, Omdef, Ocdef -> Oup, Omdef -> Owrite, PCTL
      | Ipop, Omdef, Ocdef -> Odown, Omdef -> PCTL
      | Itop, Omdef, Ocdef -> Ocdef, Oread -> Oidle -> PCTL

Figure 14: The Specification of a Pipelined Stack Controller

The  implementation  of  the  read  operation  is  trickier.  We  want  to  exploit  a  degree  of 
pipelining  afforded  by  the  presence  of  the  memory  data  register  in  this  MEM.  Specifically, 
consider  two  read  operations  issued  one  after  the  other.  While  we  are  collecting  the  results  of 
the  first  read,  we  wish  to  start  the  second  read.  (Without  this,  read  will  cost  us  two  cycles.) 
So starting from MEM in data path state ms, when Iread is invoked with address input
?cdo held stable, the MEM process turns itself into the MEM1 process. The MEM1 process
awaits the operations read, write and mdef while outputting the result of the previous read
on port !dout. While reads keep coming, we stay in state MEM1. When something other
than read comes, we go back to state MEM.
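
The pipelined read behavior of MEM and MEM1 can be traced with a small executable sketch. The Python below is only an illustration under stated assumptions, not HOP itself: the function names `mem_step`, `write` and `read`, the dictionary used to model the data path state ms, and the use of None to mean "nothing asserted on !dout" are all hypothetical choices made for this example.

```python
from typing import Dict, Optional, Tuple

Memory = Dict[int, int]


def write(ms: Memory, addr: int, data: int) -> Memory:
    """Constructor: a new memory state with `data` stored at `addr`."""
    new_ms = dict(ms)
    new_ms[addr] = data
    return new_ms


def read(ms: Memory, addr: int) -> Optional[int]:
    """Observer: the value stored at `addr` (None if never written)."""
    return ms.get(addr)


def mem_step(state: Tuple, event: str, addr: Optional[int] = None,
             data: Optional[int] = None) -> Tuple[Tuple, Optional[int]]:
    """Advance the memory process by one tick.

    `state` is ("MEM", ms) or ("MEM1", ms, oa). The result is
    (next_state, dout), where dout models the value on !dout this tick.
    """
    ms = state[1]
    # In MEM1 the result of the previous read is output while the next
    # command is being accepted.
    dout = read(ms, state[2]) if state[0] == "MEM1" else None
    if event == "mdef":
        return ("MEM", ms), dout
    if event == "write":
        return ("MEM", write(ms, addr, data)), dout
    if event == "read":
        return ("MEM1", ms, addr), dout
    raise ValueError(f"unknown event {event!r}")
```

With two back-to-back reads, the result of the first read appears on the same tick in which the second read is accepted, which is the overlap the text describes.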

Implementation  of  push 

Now consider how push is implemented by focusing on SCTL and seeing what it does on
the data path modules. When SCTL decodes the Ipush command, it outputs Omdef and
Ocdef to the memory and the counter, thereby keeping both MEM and CTR inactive.
In the next step it outputs Oup and Omdef, thereby incrementing the counter and keeping the
memory unchanged. In the next step it outputs Owrite and Ocdef, thereby writing into the
memory, at the address pointed to by the incremented counter value, the data item that is
now asserted at the input ?din. SCTL then goes back to its top-level control state. All the
other operations are implemented similarly.
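
The three-step sequencing of push can be checked with a short sketch. This is an illustrative Python model, not the HOP semantics: the table `SCTL_STEPS` and the function `run_command` are hypothetical names, only the push, pop and sdef branches of SCTL are transcribed, and the counter and memory states are modeled as an integer and a dictionary.

```python
from typing import Dict, Optional, Tuple

# Control words (counter op, memory op) issued by SCTL on successive ticks.
SCTL_STEPS = {
    "sdef": [("cdef", "mdef")],
    "push": [("cdef", "mdef"), ("up", "mdef"), ("cdef", "write")],
    "pop": [("cdef", "mdef"), ("down", "mdef")],
}


def run_command(cmd: str, cs: int, ms: Dict[int, int],
                din: Optional[int] = None) -> Tuple[int, Dict[int, int]]:
    """Apply one stack command to the data path state (cs, ms)."""
    ms = dict(ms)
    for ctr_op, mem_op in SCTL_STEPS[cmd]:
        # The memory samples the counter output (the current cs) during
        # this tick, before any counter update takes effect.
        if mem_op == "write":
            ms[cs] = din
        if ctr_op == "up":
            cs += 1
        elif ctr_op == "down":
            cs -= 1
    return cs, ms
```

Under these assumptions, a push from state (cs, ms) ends in (up(cs), write(ms, up(cs), din)): the write lands at the incremented address because the counter was updated one tick earlier.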


3.3  Pipelining  the  Stack 

After  obtaining  a  realization,  the  designer  is  usually  interested  in  optimizing  the  design 
either  manually  or  automatically.  Pipelining,  or  overlapping  the  internal  activities  within 
a  system,  is  a  frequently  adopted  optimization.  (There  are  other  optimizations  such  as  the 
sharing  of  ALUs,  busses,  etc.;  we  do  not  consider  these  in  this  paper.)  Once  manually 
pipelined,  it  is  necessary  to  validate  the  functional  correctness  of  the  system. 

It turns out that we can pipeline stkreal to a large extent. We illustrate this by “pipelining
the push operation”, i.e. carrying over some of the computation associated with push into
the following operation. To see how this can be done, consider how push was implemented
by SCTL. At first, SCTL performed the up operation on the counter. It then performed
the write operation on the memory and only then did it return to its top-level control state.
However SCTL could have been waiting for the next operation while the write operation
was still in progress internally. (As we show in section 4, this wasted period does show up
as an extra idling step in the behavior deduced by PARCOMP.) The pipelined controller
PCTL given in figure 14 achieves this degree of pipelining. If it is used in lieu of SCTL, we
get a realization called pstkreal. By introducing PCTL in lieu of SCTL we increase the
number of control states in the controller in return for the increased speed of operation. Let
us examine how pstkreal performs push.

pstkreal first decodes the command Ipush. Then it increments the counter via Oup.
It then turns itself into the process write_PCTL. This is a controller similar to PCTL with
the difference that while waiting for the next operation it keeps MEM busy internally by
applying the Owrite event on it.

3.4  Questions  to  be  Addressed 

Having specified stkabs, stkreal and pstkreal, the following questions arise naturally:

•  Deducing Behavior from Structure: Can we automatically infer a single process
equivalent to stkreal as well as pstkreal? What are some applications of this PARallel
COMPosition algorithm?

•  A Verification Problem: Is stkreal a correct design corresponding to the requirements
expressed in stkabs? How do we determine this?

•  Given the controller in figure 14 and given the claim that this controller correctly
pipelines the stack, how do we verify this claim? In what sense(s) is(are) a pipelined
system comparable to a non-pipelined system?

•  Specification Directed Testing: How do HOP specifications help in testing systems?

•  What is a system design methodology using HOP?

These  questions  are  answered  in  the  following  sections. 

4  Verification  Criteria,  and  Illustration  Thereof 

The  denotation  of  a  HOP  process  is  its  trace-nodebinding  (TN)  set.  Therefore  two  HOP 
processes  are  equivalent  if  they  have  identical  TN  sets.  However  in  many  practical  situations, 
two  processes  that  are  equivalent  in  many  useful  senses  do  not  have  identical  TN  sets.  As 
we will soon show, a pipelined hardware design can contain traces different from those
present in either the requirements specification or a non-pipelined design.

In  this  section  we  address  the  verification  of  HOP  processes.  We  illustrate  it  on  the 
non-pipelined  realization  of  the  stack.  Then  we  consider  the  problems  that  arise  in  the 
verification  of  pipelined  hardware,  and  suggest  possible  solutions. 

4.1  The  Non-pipelined  Stack  Realization 

We consider the stack realization stkreal of figure 13. The behavior of stkreal with
respect to its external ports and external events can be inferred using PARCOMP. The details
of this procedure will be provided in section 5. The inferred process, STKPAR, is shown in
figure 15. Also shown in this figure for easy reference are: (a) PSTKPAR, the inferred behavior
of the pipelined stack; (b) stkabs, the requirements specification of the stack.

stkabs [dps] <-
      Ireset *> Ofree -> stkabs[reset(dps)]
    | Ipush *> Idata_avail, vdin = ?din *> Ofree -> stkabs[push(dps,vdin)]
    | Ipop *> Ofree -> stkabs[pop(dps)]
    | Itop *> Otop_avail, !dout=top(dps) *> Ofree -> stkabs[dps]
    | Isdef *> Ofree -> stkabs[dps]

----

STKPAR [cs,ms] <-
      Ireset -> Oidle, Ofree -> STKPAR [0,ms]
    | Ipush -> Oidle -> Idata_avail, vdin=?din, Ofree
            -> STKPAR [up(cs), write(ms,up(cs),vdin)]
    | Ipop -> Oidle, Ofree -> STKPAR [down(cs), ms]
    | Itop -> Oidle -> Otop_avail, Ofree, !dout=read(ms,cs) -> STKPAR [cs,ms]
    | Isdef, Ofree -> STKPAR [cs,ms]

PSTKPAR [cs,ms] <-
      Ireset -> Oidle, Ofree -> PSTKPAR [0,ms]
    | Ipush -> Oidle, Ofree -> PSTKPAR1 [up(cs), ms]
    | Ipop -> Oidle, Ofree -> PSTKPAR [down(cs), ms]
    | Itop -> Oidle -> Otop_avail, Ofree, !dout=read(ms,cs)
           -> PSTKPAR [cs,ms]
    | Isdef, Ofree -> PSTKPAR [cs,ms]

PSTKPAR1 [cs1, ms1] <-
      Ireset, Idata_avail, vdin=?din -> Oidle, Ofree
           -> PSTKPAR [0, write(ms1,cs1,vdin)]
    | Ipush, Idata_avail, vdin=?din -> Oidle, Ofree
           -> PSTKPAR1 [up(cs1), write(ms1,cs1,vdin)]
    | Ipop, Idata_avail, vdin=?din -> Oidle, Ofree
           -> PSTKPAR [down(cs1), write(ms1,cs1,vdin)]
    | Itop, Idata_avail, vdin=?din
           -> Oidle, Ofree
           -> !dout=read(write(ms1,cs1,vdin),cs1), Otop_avail, Ofree
           -> PSTKPAR [cs1, write(ms1,cs1,vdin)]
    | Isdef, Idata_avail, vdin=?din -> Oidle, Ofree
           -> PSTKPAR [cs1, write(ms1,cs1,vdin)]

Figure 15: stkabs, STKPAR and PSTKPAR

Let us examine the push operation of STKPAR. It is initiated by Ipush. After this, during
the second time step, the user sees STKPAR idling. Actually at this time the counter module
CTR is getting incremented by one, but the ‘up’ event on CTR is hidden and hence not visible
outside. During the third time step, write is performed on MEM. As a consequence, port ?din
is being sampled now. Starting from the fourth tick, STKPAR continues to behave as before,
with its data path state advancing to the pair of states <up(cs), write(ms,up(cs),vd)>.
The events Idata_avail and Ofree are added in by the user as will be explained shortly.

Now consider operation top. During the second time step, a ‘read’ is issued on MEM, but
it appears as Oidle because ‘read’ is hidden. The result of top becomes available during the
third time step.

Adding-in Conceptual Events to an Inferred Absproc

A design is a more detailed implementation that has to relate to its requirements
specification. Different designs however have different usage protocols. The protocols are to be
made explicit before we can meaningfully compare a design with a requirements
specification. In our approach this is done by adding-in events that characterize the protocol obeyed
by the absproc description at the appropriate instants of the inferred absproc. In the STKPAR
specification in figure 15, the events Idata_avail, Ofree, and Otop_avail are added in by
the user for this purpose. For example, in the push operation of STKPAR, Idata_avail is
generated during the third time instant.


4.2  Our  Verification  Technique 

Currently our formal verification methodology applies to those systems whose operations
(modes of behavior) can be viewed as a “constructor-observer” (c_o) experiment. A process
P[S] subject to a c_o experiment consumes a sequence of input values I_i, produces a sequence
of output values O_j, and turns into a process P[Cons(S, I_i)]. It is assumed that the output O_j
can be modeled using an observer function application of the form Obs(S, I_i). Likewise, it is
assumed that data path state changes can be captured by a constructor function application
of the form Cons(S, I_i). Constructor experiments and observer experiments are special cases
of c_o experiments where either a new data path state or a new output port value gets created
(but not both). c_o experiments start with distinct command events (such as Ipush).

A  large  number  of  digital  systems  can  be  viewed  in  this  manner.  For  instance,  in  the 
proof  of  correctness  of  a  simple  microprocessor  reported  in  [Coh88],  only  constructor  and 
observer  experiments  are  considered.  This  is  also  the  case  in  the  proof  of  correctness  of 
yet  another  microprocessor  in  [Hun87].  In  both  these  works  the  constructor  experiments 
correspond  to  the  system  state  changes  caused  by  the  execution  of  instructions,  and  an 
observer  experiment  consists  of  the  observation  of  the  register  values  attained  in-between 
constructor  experiments.  In  both  these  works,  the  proof  involves  showing  that  if  the  systems 
specified  at  the  requirements  and  the  design  levels  start  in  two  observationally  equivalent 
states S and S and are subject to the same macro-instruction, the systems wind up in
two states S1 and S1 that are also observationally equivalent. Our technique is related to
these works. Due to the usage of a process model in HOP, we do not have to introduce
“oracles” [Hun87] to capture external input that may arrive at unspecified time instants.
HOP’s “busy wait” (*> Idata_avail, vdin:=?din) captures this effect. If necessary, we can
introduce “tester processes” to model the world external to a HOP process in as much detail
as we wish.


4.3  An  Outline  of  the  Verification  of  STKPAR 

Let us first consider a system with only constructor and observer experiments, in this case
the stack. We pick a constructor experiment ce_i and subject stkabs and STKPAR to it. We
then follow ce_i with an observer experiment oe_j. We then assert that the values returned by
the latter must agree, thus obtaining a Verification Condition (VC). In this way, we consider
all possible sequences of constructor experiments and observer experiments.

Though this may sound like performing an infinite number of proofs, we can achieve the
same effect by performing a finite number of proofs, using a form of data type induction
as applied to HOP processes. In HOP, data path states are modeled using constructor
functions that are defined equationally using data type axioms. Similarly, port values are
modeled using observers which are also defined via data type axioms. We will take advantage
of these facts in our verification methodology.

We assume that processes stkabs and STKPAR, when started in data path states dps and
<cs,ms>, are observationally equivalent. By this, we mean that there exists no observer
experiment that can distinguish between these two processes. (In our example, performing
top results in the same value being produced on the port !dout.) We then show that
performing the same constructor experiment ce_i on stkabs[dps] and STKPAR[cs, ms] results
in processes stkabs[dps'] and STKPAR[cs', ms'] that are also observationally equivalent.

Consider  the  top  and  push  operations.  Applying  our  methodology  results  in  the  following 
VC: 

$$\frac{top(dps) = read(ms, cs)}{top(push(dps, vdin)) = read(write(ms, up(cs), vdin), up(cs))}$$

The antecedent part is obtained from the assumption that the processes start by being
observationally equivalent, meaning that top must not distinguish them. The consequent is
obtained by performing a push first, followed by a top.

This  VC  can  be  shown  to  be  valid  using  the  equational  properties  of  the  stack,  memory 
and  counter  abstract  data  types.  A  typical  proof  would  consist  of  unfolding  the  consequent 
and  performing  a  case  analysis,  where  the  cases  considered  follow  from  the  antecedent.  In 
the  current  example,  familiar  stack  and  memory  axioms  allow  us  to  immediately  reduce  the 
consequent  to  the  tautology  vdin  =  vdin.  We  then  repeat  the  procedure  for  all  the  other 
constructors  pop,  reset,  and  sdef. 
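
The VC above can be exercised on concrete values before attempting the equational proof. The Python below is a minimal sketch under assumptions of ours: the stack is modeled as a tuple, the memory as a dictionary, the counter as an integer, and the helper names `vc_top_push` and `check_samples` are hypothetical. Evaluating the VC on samples is a sanity check only; it does not replace the proof from the data type axioms.

```python
from itertools import product
from typing import Dict, Optional, Tuple

Stack = Tuple[int, ...]
Memory = Dict[int, int]


def push(dps: Stack, v: int) -> Stack:
    return dps + (v,)


def top(dps: Stack) -> Optional[int]:
    return dps[-1] if dps else None


def up(cs: int) -> int:
    return cs + 1


def write(ms: Memory, addr: int, v: int) -> Memory:
    new_ms = dict(ms)
    new_ms[addr] = v
    return new_ms


def read(ms: Memory, addr: int) -> Optional[int]:
    return ms.get(addr)


def vc_top_push(dps: Stack, cs: int, ms: Memory, vdin: int) -> bool:
    """Antecedent implies consequent, for one choice of states and input."""
    antecedent = top(dps) == read(ms, cs)
    consequent = (top(push(dps, vdin))
                  == read(write(ms, up(cs), vdin), up(cs)))
    return (not antecedent) or consequent


def check_samples() -> bool:
    """Evaluate the VC over a small grid of sample states and inputs."""
    stacks = [(), (5,), (5, 9)]
    memories = [{}, {1: 5}, {1: 5, 2: 9}]
    return all(vc_top_push(dps, cs, ms, v)
               for dps, cs, ms, v in product(stacks, range(3), memories, [0, 7]))
```

In this model both sides of the consequent evaluate to vdin, mirroring the reduction to vdin = vdin described in the text.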

We handle systems with c_o experiments as follows. For a c_o experiment op, we first
identify its associated constructor and observer functions C_op and O_op. When op is used as
the final experiment in a sequence of experiments, we would get VCs of the following form:

$$\frac{O_{op.abs}(C_{op.abs}(dps)) = O_{op.real}(C_{op.real}(DPS))}{O_{op.abs}(C_{op.abs}(D_{abs}(dps))) = O_{op.real}(C_{op.real}(D_{real}(DPS)))}$$

In this VC, the subscript .abs denotes “as defined at the requirements level”. The subscript
.real is meant to denote “the expression obtained from the realproc level”. D is a constructor
different from C, and results from either a constructor experiment or a c_o experiment.
dps and DPS are the data path states at the requirements and design specification levels
respectively.

Our assumption that the modes of behavior can be thought of as c_o experiments imparts
a structure to TN sets such that every trace and node-binding sequence is a concatenation
of the traces and node-bindings of the individual c_o experiments. The equivalence induced
on TN sets by our verification approach is captured by the following properties:

•  STKPAR and stkabs have the same trace sets if all intervening Oidle events are dropped.

•  For every data assertion in stkabs, STKPAR also makes those assertions. (The opposite
need not be true.)

•  For every data query made in stkabs, STKPAR also makes the same query, with the
same data input provided in both cases.

4.4  Verification  of  PSTKPAR 

Similar to STKPAR, we obtain PSTKPAR through PARCOMP. The specification is in
figure 15. It includes the events Idata_avail, Ofree, and Otop_avail added in by the
designer, for reasons to be explained.

Consider the implementation of push. In stkabs, it consists of the sequence

Ipush *> Idata_avail, vdin:=?din *> Ofree -> stkabs[push(dps,vdin)]

whereas in PSTKPAR, it consists of the sequence

Ipush -> Ofree -> PSTKPAR1[up(cs),ms]

followed by the sequence generated while executing PSTKPAR1. Regardless of the choice
offered to PSTKPAR1, it awaits Idata_avail at the first step.

Observe that the ordering of events Ipush *> Idata_avail *> Ofree changes to
Ipush -> Ofree -> Idata_avail in PSTKPAR. PSTKPAR issues Ofree one step earlier than STKPAR
to permit the next operation to begin. Thus the traces of PSTKPAR and stkabs are not the
same.

To complicate things further, PSTKPAR supports the same set of experiments, but in a
different way. Consider the sequence of operations Ipush, Itop and consider how PSTKPAR
of figure 15 would execute it. Two ticks after Ipush, we end up in control state PSTKPAR1
where the choice Itop is accepted. Strictly speaking, a “top experiment” begins at control
state PSTKPAR1; however, while this experiment goes on, the state of MEM is getting updated,
with the result that top doesn’t appear to be an observer experiment.

Our  solution  to  these  problems  is  based  on  the  following  assumptions: 

•  Only  the  sequential  ordering  of  the  command  events  present  in  the  trace  sets  need 
agree.  Other  events  that  are  related  to  data  availability  and  modules  becoming  free 
can  be  ignored  while  comparing  trace  sets. 

•  Assume that every pipelined system has a “nop” event that can be inserted in between
two pipelined operations to get rid of the effect of pipelining. It is possible to artificially
introduce such an operation if it does not exist. For the stack, sdef is such an operation
that already exists.

Consider the only pipelined operation push of the stack. (All such pipelined operations
are considered in general.) To establish whether push has been correctly implemented, we
should try to establish the following equivalences between sequences of operations: For all
OP other than push:

•  push; sdef; OP ≡ push; OP; sdef

•  push; sdef; push; OP ≡ push; push; OP; sdef

The first line says that doing the sequence of operations shown on the left or on the right
on the very same pipelined system should be treated as being equivalent by all ensuing
operations. The rationale behind choosing the first sequence is the following. The left-hand
side of the first sequence pads a sdef in between push and OP, thus studying the
implementation of push without pipelining. The right-hand side checks the effect of push
when part of the computation of push is allowed to overlap into OP.
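
These two equivalences can be tried out on a data-level model of PSTKPAR. The following Python is a sketch under our own assumptions rather than the verification procedure itself: `run_pipelined` and `equivalent` are hypothetical names, the deferred write of PSTKPAR1 is modeled by an `awaiting` flag, each push carries the data value that will be sampled during the following operation, and two sequences are deemed equivalent when they leave identical final state and outputs.

```python
from typing import Dict, List, Optional, Tuple

Op = Tuple[str, Optional[int]]  # (command, data); data is used by push only


def run_pipelined(ops: List[Op], cs: int = 0,
                  ms: Optional[Dict[int, int]] = None):
    """Run a command sequence on a data-level model of PSTKPAR/PSTKPAR1."""
    ms = dict(ms or {})
    awaiting = False            # True models control state PSTKPAR1
    pending: Optional[int] = None
    outputs: List[Optional[int]] = []
    for name, data in ops:
        if awaiting:
            # The write deferred by the previous push commits first.
            ms[cs] = pending
            awaiting = False
        if name == "reset":
            cs = 0
        elif name == "push":
            cs += 1
            awaiting, pending = True, data
        elif name == "pop":
            cs -= 1
        elif name == "top":
            outputs.append(ms.get(cs))
        elif name != "sdef":
            raise ValueError(f"unknown command {name!r}")
    return cs, ms, awaiting, outputs


def equivalent(seq_a: List[Op], seq_b: List[Op], cs: int = 0,
               ms: Optional[Dict[int, int]] = None) -> bool:
    """Same final state and same outputs from the same starting state."""
    return run_pipelined(seq_a, cs, ms) == run_pipelined(seq_b, cs, ms)
```

In this model the equivalences hold for reset, pop, top and sdef, while taking OP to be push leaves a write still pending on one side, which suggests why push is excluded from the quantification.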

Section  Summary 

We  believe  that  the  use  of  abstract  data  types  as  well  as  the  absence  of  nondeterminism 
contributes  to  the  simplicity  of  the  verification  of  HOP  specifications.  Our  approach  to 
verification has similarities to that reported in [Coh88], [Hun87] and [GSS87]. None of these
works  consider  pipelined  modes  of  behavior. 

Pipelined modes of behavior are very commonly employed in high-performance
microprocessors. Often pipelining is not fully automated, and so results in the introduction of many
sequencing errors in the initial designs [Wei87]. Since formal verification is not employed
in  practice,  there  is  a  great  danger  that  some  sequencing  errors  remain  even  in  fabricated 
chips  despite  extensive  simulations.  To  our  knowledge  no  one  has  considered  the  verification 
of  pipelined  hardware  systems,  except  in  limited  domains  such  as  systolic  systems  [Hen84]. 
This,  in  our  opinion,  is  an  important  area  of  new  research. 


5  The  Basic  PARCOMP,  and  PARCOMP-DC 

The operational rules of HOP permit us to simplify the definition of a collection of processes
P_i involving the ||, Renaming and Hiding operators into the definition of a single process
P' where (i) P' does not contain any occurrences of the Renaming or Hiding operators;
(ii) P' has the || operators pushed “deep inside it”; (iii) P' does not have any data queries or
assertions in its body; instead, for every data query/assertion pair <dq,da> present in P_i,
P' has a functional expression in its body. Further, the collection P_i and P' have identical
TN sets with respect to their external ports, a fact we have already exploited during formal
verification.

This procedure, called PARCOMP, is well defined, i.e. effective. It always terminates
because:

•  The operational rules always effect simplification under a well-founded ordering; this
is true because each operational rule considers a process P of the form
ca -> P', and the rule is then recursively applied to P';

•  When we encounter a form P[sp] || ... || Q[sq] (where sp and sq are terms) for the
first time, instead of unraveling this form through the rules of HOP, we unravel a more
general form P[x] || ... || Q[y] where x, y are variables;

•  A collection P_i introduces a finite number of processes through mutual recursion. As
the || operator is “moved inwards”, eventually there will come a stage where we will
re-encounter expressions of the form P[sp] || ... || Q[sq]. Regardless of what sp and sq
are, when we re-encounter a form, we need not re-explore the form P[sp] || ... || Q[sq]
because a most general expansion for this form has already been obtained for the case
where sp and sq are variables.

Hence  PARCOMP  is  an  algorithm. 

In  order  to  determine  how  efficient  and  useful  PARCOMP  is  in  practice,  we  coded  it  and 
tried  it  out  on  a  number  of  examples.  We  conclude  that:  (i)  it  is  efficient  enough  to  be  used 
for  many  purposes;  (ii)  the  single  process  inferred  by  PARCOMP  is  attractive  in  many  ways: 

•  It  is  easier  to  understand  because  the  internal  details  are  hidden; 

•  It is more efficient to simulate because internal value communications appear as
functional  expressions  in  the  inferred  specification;  therefore  we  need  not  maintain 
information  regarding  port  value  bindings  during  simulation; 

•  Synchronization  errors  can  be  detected  because  synchronization  errors  usually  result 
in  PARCOMP  generating  a  finite  process; 

•  Only the useful modes of behavior are retained. In particular, the behavioral
description of all the “idle hardware” is not retained. Idle hardware includes both unused and
under-utilized hardware, i.e. modules with only part of their operations used.

5.1  Steps  in  the  Basic  PARCOMP  Algorithm 

We  consider  a  collection  of  processes  that  do  not  contain  any  Cond-processes  (a  more  general 
definition  of  PARCOMP  is  given  in  Appendix  C).  All  required  renamings  are  assumed  to 
be  already  done.  We  are  given  the  process  descriptions  of  the  processes  to  be  composed,  as 
well as the hiding set HS containing events and ports that are hidden. Then, the steps in
PARCOMP  for  this  problem  are: 

1.  Start  all  the  N  processes  in  their  respective  starting  states. 

2.  March the processes in unison (lockstep-synchronously) until all N-tuples of control
states that are equidistant from the starting state have been visited. (A node is at
distance d from the start state if it can be reached via d transitions from the start
state.) Record each such N-tuple of control states visited as a control state of the
inferred process.

3.  In moving from the control state N-tuple S_x to the control state N-tuple S_y, the
following actions are taken:

(a)  Collect the actions labeling the transitions going from the ith component of S_x
to the ith component of S_y, for all i in N. Call this collection C_xy.

(b)  Reduce C_xy by applying the action product operator to its members. Obtain
the normal form D_xy such that any pair of elements in D_xy is irreducible under
the action product operator.

(c)  If there are unsynchronized actions in D_xy that are in the hiding set HS, then
mark state S_y as un-reachable from S_x.

(d)  Replace the synchronized events of D_xy that are in HS by idle.

(e)  Collect the result of value communications as value bindings to the variables in
the data query:

i.  Represent these bindings as let blocks corresponding to the state S_y.

ii.  For multiple writers on a node, apply the bus function to the data assertions
to determine the resultant value on the node.

iii.  Since the user is required to indicate each tristatable port in his system, bus
connections of non-tristate ports can be caught.

(f)  Recursively call PARCOMP starting at S_y.

4.  In the end, remove states that are unreachable from anywhere.

5.  If the resulting process is a finite process, then discover where it becomes STOP, and
for all those points flag a sequencing error, and generate diagnostic information.
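
The control-state part of these steps can be sketched as a lockstep product construction. The Python below is a simplified illustration under assumptions of ours, not the PARCOMP implementation: it ignores data queries and assertions, let blocks and bus resolution; processes are dictionaries from control state to a list of (action set, next state) pairs; events are strings prefixed with I or O; and `combine`, `parcomp` and `stuck_states` are hypothetical names.

```python
from collections import deque
from itertools import product


def combine(action_sets, hidden):
    """Joint label of one lockstep move, or None if the move is pruned."""
    joint = set().union(*action_sets)
    visible = set()
    for act in joint:
        kind, name = act[0], act[1:]
        partner = ("O" if kind == "I" else "I") + name
        synced = partner in joint
        if name in hidden:
            if not synced:
                return None          # hidden and unsynchronized: prune
            # hidden and synchronized: becomes internal (idle)
        elif synced:
            visible.add("S" + name)  # synchronized, still visible
        else:
            visible.add(act)
    return frozenset(visible) or frozenset({"Oidle"})


def parcomp(processes, starts, hidden):
    """March the processes in unison, recording reachable state tuples."""
    inferred = {}
    queue = deque([tuple(starts)])
    while queue:
        state = queue.popleft()
        if state in inferred:
            continue
        inferred[state] = []
        choices = [processes[i][s] for i, s in enumerate(state)]
        for combo in product(*choices):
            label = combine([acts for acts, _ in combo], hidden)
            if label is None:
                continue
            nxt = tuple(n for _, n in combo)
            inferred[state].append((label, nxt))
            queue.append(nxt)
    return inferred


def stuck_states(inferred):
    """States with no surviving move: candidate sequencing errors."""
    return [s for s, moves in inferred.items() if not moves]


# A cut-down controller (push and sdef only) and an up-counter.
CTL = {"s0": [(frozenset({"Ipush", "Ocdef"}), "s1"),
              (frozenset({"Isdef", "Ocdef"}), "s0")],
       "s1": [(frozenset({"Oup"}), "s0")]}
CTR = {"c": [(frozenset({"Icdef"}), "c"), (frozenset({"Iup"}), "c")]}
```

Calling `parcomp([CTL, CTR], ["s0", "c"], {"cdef", "up"})` explores only the state tuples reached by marching in unison; a state tuple left with no outgoing move corresponds to the finite (STOP) case of step 5.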

5.1.1  Illustration  of  PARCOMP  on  the  push  Operation  of  stkreal 

Consider the controller SCTL, counter CTR and memory MEM of figure 13 to be in their starting
control states. Let us march them around from control state <SCTL, CTR[cs], MEM[ms]>
back to <SCTL, CTR[cs], MEM[ms]>.

1.  The first group of actions to be considered are

Ipush, Omdef, Ocdef, Imdef, Icdef.

This simplifies to Ipush, Smdef, Scdef, and after applying hiding, becomes Ipush.

2.  The second group of actions are

Oup, Omdef, Iup, !cdo = cs, Imdef

which after simplification and applying hiding becomes Oidle.

3.  The third group of actions are

Owrite, Ocdef, Iwrite, va=?cdo, vd=?din, Icdef, !cdo=cs

and after simplification and hiding becomes vd=?din.

4.  We then move back to <SCTL, CTR[up(cs)], MEM[write(ms,up(cs),vd)]>.

5.  There are no other paths that survive for the push operation. For instance, the
transition labeled by

Ipush, Ocdef, Iup, Omdef, Iwrite, ..

is not a feasible transition because while the stack controller is trying to apply the
Ocdef and Omdef events to the counter and memory respectively, the counter and
memory themselves are awaiting Iup and Iwrite. Unsynchronized as well as hidden
transitions get pruned.

The pairs S0,T1 and S1,T0 need not be considered.

Figure 16: Illustrating Marching in Unison

5.1.2  Heuristics  Employed  by  PARCOMP 

•  Generating the N-tuples of states by taking a cross-product is wasteful, as figure 16
shows. We do it by marching in unison; its results are also shown in figure 16.

•  A hidden unsynchronized event labeling a transition essentially removes the
transition from the graph. Using this mechanism, many control state N-tuples become
un-reachable and are removed early.

5.1.3  Applications  of  PARCOMP 

•  Obtaining simpler behaviors during design entry through a schematic entry system.
This way after entering a schematic, the behavior can be inferred, validated, and made
a part of the module library for further upwards composition.

•  Simulation can be easily achieved by introducing a tester process similar to that used
by [Mil85b]. The tester process is composed with the system to be tested and the
resultant process can then be run.

•  Symbolic execution of the inferred behavior is possible.

•  The inferred behavior can be used for formal verification.

Each cell is ‘M’

Figure 17: Divide and Conquer PARCOMP

5.2  A  Divide  and  Conquer  Version  of  PARCOMP 

PARCOMP-DC is a potentially faster way of computing PARCOMP by exploiting the
geometrical regularity of arhythmic arrays. We will take a generic arhythmic array structure
and illustrate PARCOMP-DC on it. Due to the shortage of space, we relegate a more
interesting example, the LRU matrix, to Appendix A. In this section we derive an expression
for the computational savings possible due to PARCOMP-DC.

Consider the array A shown in figure 17. It consists of a collection of modules M
connected in a regular interconnection pattern. For simplicity assume a nearest-neighbor
connection that is regular in both the dimensions.

Consider the problem of computing PARCOMP(A), i.e. the composition of all the Ms
constituting A. PARCOMP is both commutative and associative. Hence, we can split A
into two halves, say At, standing for “the top of A”, and Ab, standing for “the bottom of
A”. Thus,

PARCOMP(A) = PARCOMP( PARCOMP(At), PARCOMP(Ab) ).

But PARCOMP(Ab) is easily obtained from PARCOMP(At) by renaming the ports of
At to the corresponding ports of Ab. Thus we need compute only PARCOMP(At) using
the PARCOMP procedure; we can then obtain PARCOMP(Ab) by making a copy of the
data structure that represents PARCOMP(At) and applying suitable renamings to it.

But the process need not stop at the top level of division. We can split At into Atl (the
“left half” of At) and Atr (the “right half” of At) and again exploit the fact that Atr can
be obtained from Atl through copying and renaming. This gives us a divide-and-conquer
procedure. We depict the execution of this procedure as a tree in figure 17.
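The recursion just described can be sketched as follows. This is only an illustration under stated assumptions: a "behavior" is stood in for by a tuple of module indices, `rename` stands in for copying a process description and renaming its ports, and `compose` is a placeholder for a pairwise PARCOMP. None of these names come from the HOP system.

```python
def rename(behavior, offset):
    """Copy a behavior and shift its module indices (stands in for port renaming)."""
    return tuple(i + offset for i in behavior)

def parcomp_dc(n, compose):
    """Behavior of n identical modules, where n is a power of 2.

    One half is computed recursively, the other half is obtained by
    copy-and-rename, and the two halves are composed exactly once.
    """
    if n == 1:
        return (0,)                      # behavior of a single module M
    half = parcomp_dc(n // 2, compose)   # e.g. PARCOMP(At)
    other = rename(half, n // 2)         # PARCOMP(Ab), obtained by renaming
    return compose(half, other)

calls = []

def compose(a, b):
    """Placeholder for pairwise PARCOMP; it only merges the index tuples."""
    calls.append((len(a), len(b)))
    return a + b

result = parcomp_dc(8, compose)
```

For eight modules, `compose` is invoked three times (on operands of one, two and four modules each), which is the log2(N) count of pairwise compositions that divide and conquer performs.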

PARCOMP-DC  is  often  more  efficient  than  PARCOMP.  Let  us  make  an  approximate 
cost  analysis.  The  worst-case  time  complexity  of  PARCOMP  is  primarily  dependent  on  the 
number of control states that we have in a process diagram. Specifically, it can be equal
to the product of the numbers of control states of the individual processes. Suppose for
simplicity that array A is square and has N modules of type M, that M has C control states,
and that N is a power of 2. Then

cost_parcomp(A) = C^N

because  we  may,  in  the  worst-case,  end-up  taking  a  full  cross-product  of  the  process  diagrams 
of  the  N  modules. 

Suppose that the modules formed during the division process of PARCOMP-DC, M, ...,
Atl, A, all have D control states “on the average”; more precisely, D must be the root
mean square value of the number of control states. Then

cost_parcomp-dc(A) = log2(N) × D^2.

This is because we are doing log2(N) PARCOMPs of two modules at a time, where each of
these modules has D control states as a root mean square. The root mean square is needed
because we are squaring D within the summation (we are doing the summation log2(N)
times). We assume (as is the case in our data structures) that copying and renaming a
process description has negligible cost.

Firstly we note that D does not tend to increase as the size of the modules grows. In fact
for  the  LRU  module  D  was  equal  to  C.  Thus  if  D  is  close  to  C  and  if  M  is  large,  then  there 
is  a  significant  payoff  by  using  PARCOMP-DC. 
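To get a feel for the two cost expressions, the following sketch simply evaluates C^N and log2(N) × D^2 for a few array sizes. The values C = D = 2 are illustrative assumptions, not measurements, and the function names are hypothetical.

```python
def cost_parcomp(c, n):
    """Worst-case cost C^N of composing all N modules at once."""
    return c ** n

def cost_parcomp_dc(d, n):
    """Approximate cost log2(N) * D^2; n must be a power of 2."""
    levels = n.bit_length() - 1  # exact log2 for powers of two
    return levels * d ** 2

# Compare the two estimates for a few array sizes (assumed C = D = 2).
for n in (4, 16, 64):
    print(n, cost_parcomp(2, n), cost_parcomp_dc(2, n))
```

Under these assumptions the first estimate grows exponentially in N, while the second grows only with the number of division levels.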

The behavior inferred by PARCOMP-DC for large arhythmic arrays is not very intuitive
for human readers. We show the result of doing one level of PARCOMP for
the LRU module in Appendix A. In conclusion, the following approach is suggested for
handling arhythmic arrays:

•  Perform  PARCOMP  of  two  modules  of  the  array; 

• Study the inferred behavior and see if it is verifiable manually or through exhaustive
simulation (for the LRU module, we discovered a sequencing error by the former
technique);

• Apply PARCOMP or PARCOMP-DC, whichever is faster*. The behavior inferred by
PARCOMP (or PARCOMP-DC) will have complex if-then-else functions. Construct
tabular functions corresponding to these.

•  Use  these  tabular  functions  for  efficient  simulation. 

•  Try  to  perform  formal  verification  of  the  whole  array  by  setting  up  an  induction. 

*We  may  race  them  and  pick  the  winner! 
6 Concluding Remarks

6.1  A  Design  Methodology  Based  on  HOP 

The following steps capture a top-down design methodology that is currently under
investigation. While our investigation is still in its infancy, we believe it is important to show
how we have fused verification with design.

• Write the requirements specification for the module M to be designed;

• Identify the submodules that are to constitute the realization;

• Write requirements specifications for the submodules;

•  Apply  PARCOMP  to  the  requirements  specifications  of  the  submodules; 

• Follow the HOP verification methodology and verify that the behavior inferred is
equivalent to the original requirements specification;

• (Recursively) invoke the HOP design methodology on the submodules, thereby
obtaining a circuit and a design specification for them;

•  Match  the  requirements  specification  for  the  submodules  against  their  corresponding 
design  specifications;  obtain  a  definition  of  the  events  in  the  requirements  specification 
in  terms  of  the  events  at  the  design  level; 

• Propagate this information back to the level of M, thereby obtaining a design
specification for M;

•  Apply  optimizing  transformations  to  M. 

In  many  of  the  above  steps,  we  believe  that  a  graphical  editor  for  process  diagrams  can 
be  gainfully  employed.  For  instance,  in  many  situations  a  user  could  graphically  specify 
a  highly  inefficient  but  functionally  correct  controller,  such  as  SCTL.  After  completing  a 
first  design  based  on  it,  the  optimized  version  PCTL  can  be  obtained.  Some  heuristics  are 
known to us: e.g. “burping” all the idle steps in the inferred process by overlapping actions.
In  principle,  the  approach  for  pipelining  calls  for  systematically  rearranging  events  labeling 
process  transitions  without  violating  observational  equivalence  between  requirements  and 
design  specifications.  In  this  way  we  hope  that  the  rigorous  semantics  of  HOP  would  help 
in  design  synthesis  and  optimization. 

6.2  Ongoing  and  Future  Work 

We  have  implemented  a  first  prototype  of  PARCOMP  to  study  its  performance  on  simple 
examples.  Based  on  this  experience,  we  are  now  engaged  in  developing  a  more  elaborate 
implementation  of  the  HOP  system.  The  implementation  uses  FROBS  [Mue87]  that  supports 
object-oriented programming, data activated daemons, and an inference engine. This would
help  in  watching  simulation  results  in  a  very  flexible  manner.  Complex  trace  mechanisms, 
such  as  in  logic-state  analyzers,  can  be  built.  We  summarize  some  of  our  results  to  date  in 
appendix  B. 

In parallel, we are engaged in the concurrent specification and design of a large ASIC
called the RBC [FTG88b,FTG88a]. Tackling this large example has benefited HOP greatly.
For example, several arhythmic arrays are present in the RBC. Issues such as modeling
through connections satisfactorily and supporting the grouping of ports into arrays and
records of ports for convenience are quite important if we are to manage the complexity
of a large specification. We are aiming towards a prototype of both the RBC and the
HOP system.

A preliminary design of the RBC was verified by hand using the technique illustrated on
the stack. Realizing the tedium and error-prone nature of the hand proof, we plan to
semi-automate some of the verification steps in the long run. We plan to study the equational
laws of HOP as well as investigate notions of equivalence among HOP processes.
Figure 18: An LRU Matrix (?c: a vector of ports. Algorithm: set row; reset col; find)

A  An  LRU  Matrix 

Figure 18 shows a unit to compute the “Least Recently Used” location, as described in
[Tan87, page 217]. The specification of one cell of this unit, lru_cell, is shown in figure 19.

This unit is meant to operate inside a memory management unit as follows.

Conceptually, the algorithm to be followed is the following two-phase algorithm. Initially
the matrix starts with all zeros. Whenever a memory address is accessed, a unary
representation of the address is fed to both the ?r and ?c inputs. The bits in the indexed row are first
set. Thereafter the bits in the indexed column are reset. The LRU location is always pointed to by the
row that has all zeros. After four distinct address accesses, there is guaranteed to be such
a row. The implementation does not use the two-phase algorithm directly; rather, priority
logic resident in each lru_cell decides whether to set or to reset the cell. The description
in figure 19 details the algorithm.
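The conceptual two-phase algorithm (not the priority-logic implementation of figure 19) can be sketched in a few lines. The sketch assumes a 4 × 4 matrix held as a list of lists and uses an integer index in place of the unary address; the function names are hypothetical.

```python
def access(matrix, k):
    """Record an access to location k: set row k, then reset column k."""
    n = len(matrix)
    for j in range(n):
        matrix[k][j] = 1
    for i in range(n):
        matrix[i][k] = 0

def lru(matrix):
    """Return the indices of all-zero rows (the LRU candidates)."""
    return [i for i, row in enumerate(matrix) if not any(row)]

# Start with all zeros and access the four locations in order.
m = [[0] * 4 for _ in range(4)]
for addr in (0, 1, 2, 3):
    access(m, addr)
```

After the four accesses, row 0 is the only all-zero row, so location 0 is reported as least recently used; accessing location 0 again moves that role to location 1.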

We  performed  one  step  of  PARCOMP-DC  by  hand  and  obtained  the  description  shown 
in  figure  20  for  the  behavior  of  two  cells  acting  together.  This  specification  can  be  used  for 
simulation  and  design  verification.  Following  this,  we  can  mechanically  derive  the  behavior 
of  the  entire  array,  represent  it  as  a  tabular  function  and  employ  it  for  simulation. 
-- This spec. clearly shows that lrucell changes its datapath
-- state during the rising edge of the clock and puts out a
-- value on port !nextout during the falling edge. Thus during the
-- falling edge, the external latch can latch in the results.
-- The "if" functions used herein have obvious implementations
-- using combinational logic. In addition, each lrucell has a one-bit
-- flip-flop. Details available from the author upon request.

ABSPROC lrucell

PORTS   ?previn, ?rowin, ?colin, !nextout : BIT
CLOCKS  singlephase(ck, not(ck))
EVENTS  ckrise = ck
        ckfall = not(ck)

PROTOCOL

lrucell [dps]  <= ckrise, vcolin = ?colin, vrowin = ?rowin
               -> lrucell1 [ if(vcolin, 0,
                                if(vrowin, 1, dps)) ]

lrucell1 [dps] <= ckfall, vprevin = ?previn,
                  !nextout = if(vprevin, 1, dps) -> lrucell [dps]

END lrucell

Figure 19: Specification of the lru_cell module
-- Two lrucells as shown are subject to a PARCOMP. The result is shown as an absproc.
-- These ?colin ports are together regarded as a single VECTOR PORT ?colin.
--
--              ?colin[0]    ?colin[1]
--                  |            |
--                  v            v
--  ?rowin      |-------|    |-------|
--         ---->|       |--->|       |  !nextout
--              | dps[0]|    | dps[1]|---->
--           -->|       |--->|       |
--           |  |-------|    |-------|
--           |
--           -> gnd (internally grounded)

ABSPROC twolcs

PORTS   ?rowin, ?colin[0], ?colin[1], !nextout : BIT
CLOCKS  singlephase(ck, not(ck))
EVENTS  ckrise = ck
        ckfall = not(ck)

PROTOCOL

twolcs [ dps[0], dps[1] ]    -- Fully expanded form dps[0],[1] is for convenience
    <= ckrise, vcolin = ?colin, vrowin = ?rowin
    -> twolcs1 [ if(vcolin[0], 0,
                    if(vrowin, 1,
                       dps[0])),
                 if(vcolin[1], 0,
                    if(vrowin, 1,
                       dps[1])) ]

twolcs1 [ dps[0], dps[1] ]
    <= ckfall, !nextout = if(vcolin[0], if(vcolin[1], 0,
                                           if(vrowin, 1, dps[1])
                                          ),
                             if(vrowin, 1,
                                if(dps[0], 1,
                                   if(vcolin[1], 0,
                                      if(vrowin, 1, dps[1])
                                     )))
                            )
    -> twolcs [ dps[0], dps[1] ]

-- These if forms may be converted into a tabular function that is
-- sequentially searched during simulation. So the simulator doesn't
-- have to keep track of the internal wires of the LRU matrix, and
-- hence becomes efficient.

END twolcs

Figure 20: Specification of two lru_cell modules composed by PARCOMP
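The conversion of nested if forms into a tabular function can be sketched as below. The `nextout` function transcribes the !nextout expression of figure 20 as we read it; the helper `if_`, the argument names, and the use of a dictionary lookup (rather than the sequential search mentioned in the figure) are assumptions made for illustration.

```python
from itertools import product

def if_(cond, then_val, else_val):
    """HOP-style if(c, a, b) on bit values."""
    return then_val if cond else else_val

def nextout(vc0, vc1, vr, d0, d1):
    """!nextout of the two-cell composition, as read from figure 20."""
    inner = if_(vc1, 0, if_(vr, 1, d1))
    return if_(vc0, inner, if_(vr, 1, if_(d0, 1, inner)))

# Build the table once: one row per combination of the five input bits.
TABLE = {bits: nextout(*bits) for bits in product((0, 1), repeat=5)}

def nextout_tabular(vc0, vc1, vr, d0, d1):
    """Look the output up instead of re-evaluating the nested ifs."""
    return TABLE[(vc0, vc1, vr, d0, d1)]
```

Once the table is built, a simulator only needs the five input bits to obtain the output and never has to model the wire between the two cells.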
B Results To Date

• An initial design of the RBC chip has been verified by hand.

• PARCOMP has been coded in Lisp. Applying it to the stack example considered in
this paper resulted in the inferred absprocs that we have shown. The execution time
was in the range of a few seconds.

• Other modules specified in HOP include a Translation Lookaside Buffer and the internal
circuitry of the RBC chip that consists of about ten large arhythmic arrays.
C A Specification of PARCOMP

Input: An expression Hide HS in || {Pi[Xi], ..., Cj[Xj], ...} for i ∈ {1..m}, j ∈ {1..n}.
Cj are conditional processes of the form Cj[Xj] = if qj then Tj[gj(Xj)] else Fj[hj(Xj)],
and Pi are non-conditional processes of the form Pi[Xi] = pi : initials_i → Ri[Yi].
Each Pi offers a set of initial choices initials_i, and for each choice pi that is offered,
the future behavior of Pi is Ri[Yi]. HS is the Hidden Set, the set of events and ports
hidden from the parallel composition.

Output: A behaviorally identical process P[X1, ..., Xj, ...].

Method: A done-list is maintained for each parallel composition || {Pi[Xi], ...} that
has already been computed. Upon getting a call for performing parallel composition,
the done-list is first consulted.

• If the requested parallel composition is in the done-list, return. Else enter it in the
done-list and proceed as follows.

• Combine all conditional processes into one conditional process C. Combining two
conditional processes is done as follows:

    C1[X1] = if q1 then T1[g1(X1)] else F1[h1(X1)]
    C2[X2] = if q2 then T2[g2(X2)] else F2[h2(X2)]

    C1[X1] || C2[X2] = if (q1 ∧ q2) then T1[g1(X1)] || T2[g2(X2)]
                       else if (q1 ∧ not(q2)) then T1[g1(X1)] || F2[h2(X2)]
                       else ... etc. (all four combinations)

• Now we are left with the task of computing Hide HS in || {Pi[Xi], ..., C}. Let C be
of the form

    if q1 then C1[g1(X)] else if q2 then C2[g2(X)] etc.

|| {Pi[Xi], ..., C} reduces to a conditional process with the qi as the conditions. This
conditional has in it parallel compositions of the form || {Pi[Xi], ..., Ci} that are
(recursively) computed. Eventually we are faced with composing non-conditional processes
in parallel. We take this up next.

• Consider || {Pi[Xi], ...}. Let each Pi be

    Pi[Xi] =   ca_i^1 → R_i^1[Y_i^1]
             | ...
             | ca_i^ni → R_i^ni[Y_i^ni]

• Generate tuples

    T = < ca_1^x1, ca_2^x2, ..., ca_m^xm >

i.e. a tuple of the x1-th initial compound action offered by P1, the x2-th initial compound
action offered by P2, etc. This tuple T is assumed to be the irreducible form arrived at
after applying the action product rules of figure 8. According to the rule Parcomp for
parallel composition, all such tuples would become the initial choices of the resultant
process. Following such choices, the resultant process would continue to behave like
|| {R_1^x1[Y_1^x1], ..., R_m^xm[Y_m^xm]}. However, using the hiding information HS, we can
prune many of these choices. In particular,

- those tuples T that contain unsynchronized events !e or ?e that belong to HS are
dropped, and the corresponding arm of the synchronization tree is pruned;

- those tuples T that contain synchronized events f that belong to HS are replaced
by T[idle/f].

• In computing the continued parallel composition, the bindings generated by taking
action products of the members of T are taken into account. Specifically, we construct
a let block containing these bindings. □
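The tuple-generation and pruning steps can be sketched as follows. This is a loose illustration, not the specification itself: it assumes that a compound action is a set of event names, that "!e" meeting "?e" synchronizes to "e" (a stand-in for the action product rules of figure 8, which are not reproduced here), and that HS is a set of bare event names. The done-list, conditionals and let bindings are omitted, and all function names are hypothetical.

```python
from itertools import product

def action_product(actions):
    """Merge one compound action per process.

    Assumed rule: '!e' together with '?e' synchronizes to 'e';
    every other event is carried over unchanged.
    """
    events = set().union(*actions)
    merged = set()
    for ev in events:
        if ev[0] in "!?":
            partner = ("?" if ev[0] == "!" else "!") + ev[1:]
            merged.add(ev[1:] if partner in events else ev)
        else:
            merged.add(ev)
    return frozenset(merged)

def initial_choices(processes, hidden):
    """Initial choices of the composition after pruning with the hidden set."""
    choices = []
    for tup in product(*processes):
        t = action_product(tup)
        # Drop tuples that contain an unsynchronized hidden event.
        if any(ev[0] in "!?" and ev[1:] in hidden for ev in t):
            continue
        # Replace synchronized hidden events by idle.
        choices.append(frozenset("idle" if ev in hidden else ev for ev in t))
    return choices

# Two hypothetical processes, each offering two initial compound actions.
P1 = [frozenset({"!req"}), frozenset({"tick"})]
P2 = [frozenset({"?req"}), frozenset({"tick"})]
choices = initial_choices([P1, P2], hidden={"req"})
```

Of the four candidate tuples, the two in which only one side offers the hidden event are dropped, the synchronized hidden event becomes idle, and the joint "tick" survives unchanged.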
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